Deep Learning in Virtual Data Center Management


During VMworld San Francisco I had the opportunity, as a VMware vExpert, to attend a special breakfast that SIOS put on and at which my friend David Klee was speaking. This was a pretty cool session about how SIOS is using data analytics to look at the health of the data center.

SIOS Technology Corp
SIOS Logo
Website: http://us.sios.com/
About: http://us.sios.com/about-us/

You’re probably thinking, VMware already has this, it’s called vRealize Operations… You are correct, they do. But what if there were something more powerful than Operations? That would be pretty cool, and that’s what SIOS does with their SIOS iQ product. The product uses deep learning to determine what normal looks like for a virtual data center.

On the surface this is a pretty darn cool idea. Your DC takes on many characteristics of an organism, and instead of someone having to interpret what’s wrong, the data center can actually tell you what the problem is and how far the problem spreads. That was my take from the session anyway.

I think that’s pretty darn cool, and that’s enough of the commercial. Let’s get to the meat of this post, and if you hadn’t guessed, it’s time for another one of my crazy ideas…

SIOS technology and deep learning are really cool and powerful for a DC admin. Let’s scale this up a notch. When I talked with Sergey A. Razin following the session, I asked him about increasing the power of the solution with GPUs. This creates a very powerful shift in data center management: instead of just telling you there is a problem, the data center can become predictive and maybe even self-healing.

End to end Deep Learning (NVIDIA.com)

How can this be done? Instead of single-threaded deep learning, where we have to process data in a somewhat sequential set of operations, let’s use GPUs so we can look at data on a much more massive scale. (See: Getting Started with Deep Learning)

We can take the data from the data center (ultimately more than a human could process) and use a deep learning model to process it. This would allow for a more comprehensive analysis of the data center and what could be done to optimize resources as well as prevent DC failures. For example, one of the metrics that could be evaluated is CPU core speed, a fairly innocuous item to monitor in and of itself. Now what if the management system started noticing that processors in a given host were getting slower? Possibly a bad processor. Now let’s say there were minor fluctuations in the processor speed of several CPUs, but they were not significant enough on their own to indicate a problem. A GPU-enabled monitoring system, able to look at these variances together, could come to the conclusion that there is a cooling/airflow problem, that it originates at the server that’s clocking down, and that it spreads out like a bull’s-eye. We can then start to pinpoint problems in the DC and target them before they become major catastrophes.
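
To make that more concrete, here is a toy sketch of the sort of variance analysis I’m describing. This is not SIOS’s method and it isn’t deep learning, just a simple statistical stand-in that shows the shape of the logic; the host names and clock-speed samples are made up, and a real system would pull this telemetry from your monitoring feed at a far larger scale.

import numpy as np

# Hypothetical telemetry: rows = hosts, columns = samples of effective CPU
# clock speed (MHz) pulled from whatever monitoring feed you already have.
host_names = ["esx01", "esx02", "esx03", "esx04"]
clock_mhz = np.array([
    [2600, 2598, 2601, 2597, 2599],  # healthy host
    [2600, 2590, 2575, 2560, 2540],  # steadily clocking down
    [2600, 2596, 2588, 2585, 2580],  # minor drift
    [2601, 2600, 2599, 2600, 2601],  # healthy host
])

# Baseline each host against its own history, then z-score the newest sample.
baseline_mean = clock_mhz[:, :-1].mean(axis=1)
baseline_std = clock_mhz[:, :-1].std(axis=1) + 1e-9
z_scores = (clock_mhz[:, -1] - baseline_mean) / baseline_std

# One host tripping the threshold suggests a bad processor; several hosts
# drifting together is the bulls-eye pattern that points at cooling/airflow.
THRESHOLD = -2.0
flagged = [h for h, z in zip(host_names, z_scores) if z < THRESHOLD]

if len(flagged) == 1:
    print("Possible bad processor on " + flagged[0])
elif len(flagged) > 1:
    print("Correlated clock-down on " + ", ".join(flagged) + ": check cooling/airflow")
else:
    print("No significant clock drift detected")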

Real Time GPU Operations Analytics

My View of Real Time GPU Operations Analytics

Imagine having deep learning like this, where the DC operations system has visibility into both your application workloads and your user workloads (servers and VDI). The load on a VDI system starts to spike, then RAM utilization flares on a SharePoint server. All of a sudden the network monitoring (NSX) sees an increase in outbound traffic to a foreign IP, and other systems in the environment are starting to peak. If the deep learning operations system has been built to understand patterns like this, you would be able to shut down the network and contain the desktops. This would also provide a forensic starting point to look at what has been impacted in the DC. Imagine what something like this could have done for Sony…
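
And here is an equally crude, rule-based stand-in for what a trained operations model would learn on its own: correlate otherwise-unremarkable signals and, when enough of them line up, kick off containment. The signal names and the response are hypothetical placeholders, not any real NSX or Horizon API.

from dataclasses import dataclass

@dataclass
class Signal:
    name: str
    anomalous: bool  # e.g., flagged by whichever model watches that metric

def containment_needed(signals, min_correlated=3):
    """True when enough independent signals are anomalous at the same time."""
    return sum(s.anomalous for s in signals) >= min_correlated

# Hypothetical snapshot of the scenario described above.
snapshot = [
    Signal("vdi_load_spike", True),
    Signal("sharepoint_ram_flare", True),
    Signal("nsx_outbound_to_foreign_ip", True),
    Signal("storage_latency_jump", False),
]

if containment_needed(snapshot):
    # Placeholder action: a real system would call the NSX / Horizon APIs to
    # isolate the network segment and snapshot the affected VMs for forensics.
    print("Correlated anomalies detected: isolate segment and preserve evidence")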

All that said, it’s cool to think about how the data center could be changed by the work SIOS is doing, and changed even further into a self-healing data center by expanding that work with GPU capabilities to process more data faster and learn more about the DC. This of course would let IT spend less time monitoring the data center and more time innovating in it. For all the business majors out there, that means increased productivity, better business agility, and faster time to market. All very compelling reasons for the creation of such technology.

I wish I had the time to build on this and create an architecture for something like this that I could share. Unfortunately I don’t have the time or a $20,000 GPU to build a framework to share with everyone. That’s why I shared it with my friends at SIOS and why I’m sharing it with you now. It’s one of my harebrained ideas that someone may be inspired by to create something cool.

I hope you found this post interesting.

Tony

 

Permanent link to this article: https://www.wondernerd.net/deep-learing-in-virtual-data-center-management/

Dynamic HPC Farms


High Performance Compute as a Service (HPCaaS)

The mainframe days of yore, punch cards and time sharing. An ACE station with twin Control Data computers. (NASA photo 107-KSC-67C-919)

Continuing with my random ideas posts, here is another one that can be applied to High Performance Compute (HPC) scenarios. At research institutions and the like, it is often necessary to wait to use physical HPC clusters to perform research. (It hearkens back to the days of mainframes and the stories I heard of having to get up at 2 am for your slot on the mainframe.) What if it could be virtualized and the clusters built out on demand? Maybe deliver HPC as a Service (HPCaaS)? This would speed delivery time, make HPC resources more accessible, and facilitate more dynamic HPC environments.

Hopefully if you’re this far down you’re thinking HPCaaS would be pretty cool. How would one go about putting this together?

There are a few ways to do this, and I’m going to use ArcGIS in my example as it’s the one I am most familiar with. It also fits nicely into my wheelhouse.

One way to deliver HPCaaS is to build a cloud service around it using vRealize Automation and Orchestration, for example Enterprise Hybrid Cloud (EHC), which is an offering from a federation of companies, one of which I work for (EMC). This can be a very powerful solution, allowing the deployment and configuration of storage, compute, etc. It can stand up ArcGIS clusters and the other components needed. You can even leverage VMware NSX to dynamically instantiate your load balancers.

This makes it possible to stand up and configure clusters like the one below for an ArcGIS deployment. You can stand up the NLBs, Web Adaptors, cluster A or B, etc. Essentially you build out the components once, then create blueprints and service catalog(s) around them and enable users to request the clusters they need (a rough sketch of what such a request could look like follows the diagram below).

Multiple-machine deployment with GIS server clusters (ArcGIS Pro)
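
To give a feel for the self-service side, here is a rough sketch of how a user (or a portal sitting in front of them) could request an “ArcGIS cluster” blueprint programmatically. I’m assuming a vRA-style consumer catalog REST API; the hostname, tenant, credentials, catalog item name, and the exact URL paths are placeholders from memory, so check them against your vRA version’s documentation before trusting them.

import requests

VRA = "https://vra.example.com"   # placeholder vRA appliance
TENANT = "vsphere.local"          # placeholder tenant
USER, PASSWORD = "hpc-user", "secret"

# 1. Get a bearer token (identity service path assumed from memory).
token = requests.post(
    VRA + "/identity/api/tokens",
    json={"username": USER, "password": PASSWORD, "tenant": TENANT},
    verify=False,
).json()["id"]
headers = {"Authorization": "Bearer " + token, "Accept": "application/json"}

# 2. Find the entitled catalog item published for the ArcGIS cluster blueprint.
items = requests.get(
    VRA + "/catalog-service/api/consumer/entitledCatalogItems",
    headers=headers, verify=False,
).json()["content"]
arcgis = next(i for i in items
              if i["catalogItem"]["name"] == "ArcGIS HPC Cluster")
item_id = arcgis["catalogItem"]["id"]

# 3. Pull the request template, tweak any fields (cluster size, GPU flag, etc.),
#    and submit it. vRA/vRO workflows then do the actual provisioning.
base = VRA + "/catalog-service/api/consumer/entitledCatalogItems/" + item_id + "/requests"
template = requests.get(base + "/template", headers=headers, verify=False).json()
template["description"] = "ArcGIS cluster for the spring research run"
requests.post(base, headers=headers, json=template, verify=False)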

Side note: it’s often necessary to think about this differently than virtualization is traditionally thought of. Instead of consolidating and “sharing” the performance capabilities of x86 hardware, we are simply creating a bigger pool of resources. In an HPC environment the goal is not to squeeze as many servers/desktops as possible onto the hardware; it’s to provide the resources needed. So if it’s only possible to get a couple of HPC systems on a single ESXi host, that should be fine and expected.

Creating HPC clusters this way works pretty well. It only becomes complicated when you start adding requirements like GPUs to the mix, along with those pesky end-user devices. VMware Horizon View can take care of the end-user devices. Now what about the GPUs?

You could also build clusters leveraging GPUs with Horizon View. It should be noted that “Windows 10, 8.1, 8, and 7 are supported for basic testing and application development use only. They are not recommended for deployment in a production environment.” (see the note here) I haven’t tried deploying server-based OSs with GPUs from Horizon View; it may or may not work. As for a desktop OS, deployment is a breeze and it can be incorporated into vRealize Automation and Orchestration. This provides the expanded capability of leveraging GPUs for HPC.

How is this automated? There are two methods to deliver desktop OSs as part of an automation solution. First, you can leverage the vRealize Orchestrator plugin for Horizon View; it includes several workflows and can enable a service catalog for desktops. The second option is to use PowerCLI and create your own set of cmdlets that you can use with or without vRealize. You can find more about this in the Horizon View documentation here.

This brings together all the parts needed to create an HPCaaS system, allowing users to visit a self-service catalog and request a solution that is “tailor made” for their HPC needs and can be stood up in a fraction of the time needed to manually build all of the components.

If you would like to know more about this or have questions please leave a note in the comments section.

Hope you enjoyed this random musing,

Tony

Permanent link to this article: https://www.wondernerd.net/dynamic-hpc-farms/

ArcGIS Virtualization Links


I’ve had some questions about virtualizing ArcGIS, and I thought it might be helpful to share the links I’ve found useful when architecting ArcGIS VDI/EUC deployments.

Updated 3-19-16: added a couple of additional links.

Sample ArcGIS rendering (MIT – 11.188: Urban Planning and Social Science Laboratory)

As I’m reviewing my links, I’ve discovered that some are no longer relevant or have been updated to include only limited information about virtualization, so I have decided to exclude them.

All that said, here are the links that I am aware of:

I wish the list were longer; if you know of one that I missed, please add it in the comments section below.

Enjoy,

 

Tony

Permanent link to this article: https://www.wondernerd.net/arcgis-virtualization-links/

GPUs and Detecting Pestilence


This is a project that has been on my mind for about a year now. Chalk this up as another crazy idea that could make someone a lot of money.

About a year ago, it was announced that K-State University (proud alumnus) would be partnering with Australia’s Queensland University of Technology in a really cool effort to use unmanned aerial systems (UASs) to detect pestilence. You can read the press article here: http://www.k-state.edu/today/announcement.php?id=18855

You are probably thinking, “That’s great, Tony, but how can GPUs (graphics processing units) help with this?” That’s a good question. About a month after K-State’s announcement came the GPU Technology Conference, where Andrew Ng discussed using deep learning to distinguish pictures of mugs and cats (http://www.gputechconf.com/highlights/2015-replays).

That’s when I started building on that idea in my head. Here is K-State collecting tons of data on what pestilence looks like from UASs, and we are on the cusp of commercializing GPU-based deep learning. What if we could do deep learning on all the material that was gathered from the initial study, then leverage real-time or near-real-time data streams from UASs to detect not only pestilence but all sorts of other agricultural ailments?
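
To sketch what that deep learning step could look like, here is a minimal image classifier of the kind I have in mind, assuming the UAS imagery has been tiled into fixed-size patches and labeled (say, healthy versus diseased) by the researchers. The directory names, patch size, and class count are hypothetical placeholders; a production model would be far larger and would train on the GPU cluster described below.

import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (128, 128)  # assumed patch size
NUM_CLASSES = 2        # healthy vs. diseased (could grow to per-ailment classes)

# Load labeled patches from a layout like uas_patches/train/healthy, etc.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "uas_patches/train", image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "uas_patches/val", image_size=IMG_SIZE, batch_size=32)

model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training lands on the GPU automatically when TensorFlow can see one.
model.fit(train_ds, validation_data=val_ds, epochs=10)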

This has the potential to exponentially improve agriculture (and many other industries as well). No longer will it be necessary to treat an entire field for a blight; it can be done with precision, reducing food costs and increasing productive time for the stewards of the land, farmers.

You may be thinking this is a niche market. It could in fact impact roughly 7 billion people. I don’t know about you, but I like to eat, and because I don’t have the space in my backyard to be self-sustaining, I’m going to be dependent on vegetable farmers, grain farmers, citrus farmers, ranchers, and many other agricultural industries to make foodstuffs available for me to purchase either directly or indirectly. The chart below from the USDA shows how the prices for various agricultural products have changed over the last 30 years.

Wouldn’t it be great if we could leverage technology we already have to help keep food bills lower? I suggest working on a project that leverages GPUs and deep learning. How do we do it? I’m happy to make introductions, either in industry or at K-State.

Here is what it would take.

  • HPC cluster with NVIDIA GRID GPUs
  • A moderately sized SAN to store the resulting data
  • Financial backing
  • Integration with the UAS teams collecting the data
  • A field to fly over
  • Development of systems to feed data back for processing
  • Technologist to develop the infrastructure
  • Programmers to develop new learning techniques

The first 3 bullet points have to come from the technology industry. The last 5 can be driven at the university level with guidance from the industries. You’ll probably notice I said industries… While the technology industry can help create this new paradigm, it will need help from agricultural engineers and many others to make it rugged and viable.

I envision a high level design like this:

UAS-GPU-HPC

I don’t know a university that wouldn’t be interested in this project. If you’re in industry (ag, tech, engineering, or other) and are interested in helping with any of the first 3 items in the list above, let me know.

That’s all for now,

 

The Wondernerd

Permanent link to this article: https://www.wondernerd.net/detectingpestilence/

Do you want to build a snowman??? (virtual or otherwise)


Yes, you can bash me now for the reference to Disney’s Frozen. When I wrote this a week ago I was sitting at my desk with the doors and windows open, enjoying the 70-degree weather here in Kansas while the East Coast was getting pounded by #Snowpocalypse. Now that most everyone hates me for one reason or another, let’s get to the meat of this post. 😎

I was working on some stuff around Desktop as a Service (DaaS) and got to thinking about #Snowpocalypse in the Northeast this last week. It hit me how valuable a DaaS solution would be to many organizations affected by the blizzard. I’m not talking about just a VDI solution; I’m talking about cloud desktops, also known as VMware Horizon Air.

What’s so great about cloud desktops? “I can do it all with VDI in my own datacenter without relying on a service provider.” You’re right, you can! You can even run VMware Horizon DaaS in your datacenter. But there is more to it than just that.

What got me thinking about how VMware Horizon Air was great for the Snowpocalypse was all the IT staff who are “just hanging out” at the datacenter to ensure uptime. Maybe they don’t mind it, and maybe they live for making sure the cutover to generator power goes smoothly when the power goes out. Not to mention all of the workers snowed in at home who want to work but can’t, because they can’t connect to the datacenter.

DaaS Model

VMware Horizon DaaS Model

This is where VMware Horizon Air really starts coming in handy for businesses large and small. You can provide desktops to employees, even those of us in the middle of Kansas, and when Snowpocalypse happens, move those users to other datacenters around the country. Then it doesn’t matter if the power goes down, the communication lines to the datacenter in the affected area drop, or the IT staff are unable to get to work.

What makes VMware Horizon Air even better is that it’s elastic. Maybe all of your desktops normally run on-prem; heck, maybe most of your organization still uses physical desktops. When something crazy like Snowpocalypse happens and the datacenter doesn’t have the capacity to provide everyone a desktop, you can grow out into the cloud and still give everyone a desktop, enabling them to get work done.

Wouldn’t it be even better if those desktops ran like they had an SSD backing them? But who wants to pay for expensive SSD? At VMware PEX, EMC said they plan to offer a Desktop as a Service reference architecture (RA) on the EMC* XtremIO all-flash array. The reference architecture supports Starter X-Bricks scaling up to six full X-Bricks in a cluster. (The X-Brick is the storage building block for XtremIO.) A Starter X-Brick can support up to 1,250 full-clone virtual desktops, and a full X-Brick can support up to 2,500 full-clone virtual desktops. Additional X-Bricks can be added to scale out linearly as demand increases.
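
Just to put that scaling in perspective, here is a back-of-the-napkin calculation using only the figures quoted above, and assuming the 2,500-desktop number is per full X-Brick and the linear-scaling claim holds:

# Rough desktop-capacity math from the figures in the paragraph above.
STARTER_XBRICK_DESKTOPS = 1250  # full-clone desktops on a Starter X-Brick
FULL_XBRICK_DESKTOPS = 2500     # full-clone desktops per full X-Brick (assumed per brick)
MAX_XBRICKS = 6                 # maximum full X-Bricks in a cluster per the RA

max_desktops = FULL_XBRICK_DESKTOPS * MAX_XBRICKS
print("Entry point: about {:,} desktops on a Starter X-Brick".format(STARTER_XBRICK_DESKTOPS))
print("Fully built out: roughly {:,} full-clone desktops".format(max_desktops))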

You might be thinking, “DaaS is too big for me.” You can still work with a service provider to see if they offer an EMC-based DaaS solution that you could purchase space in.

With that said, your employees can either be outside building a snowman with those who don’t have DaaS, or inside being productive. Which is better for your employees, and for your bottom line?

*In the interest of full disclosure I’m an EMC employee and I’m heavily involved in this release.

Permanent link to this article: https://www.wondernerd.net/do-you-want-to-build-a-snowman-virtual-or-otherwise/

GPU Virtualization Links (Part 1)


Dilbert.com

Today I start a series of blog posts on GPUs and 3D rendering in VMware Horizon View. I’ll be sharing much of my knowledge on the technical aspects of GPUs. In this post I will share links that will be helpful as you plan and deploy a graphically intensive workload. In future posts I will cover sizing calculations, hardware install tips, and many other GPU-related items. I hope to put out a post every week or two on this.

If you have questions or there is a topic on GPUs you would like me to touch on please drop me a note or add a comment.

Below is a list of links I use when working with GPUs in VMware. This isn’t a comprehensive list by any means. If you have more to add, please be sure to put them in the comments.

The Cards

We’ll start with the GPUs. This is based on the VMware HCL for ESXi 5.5 U2 hosts as of the date of this blog post. (http://partnerweb.vmware.com/comp_guide2/search.php?deviceCategory=vsga)

NVIDIA GRID Cards:

 

 

 

 

AMD FirePro:

 

 

The Servers

With the cards addressed, here are some links to hardware vendors whose servers have been certified with one or more of these cards. (http://partnerweb.vmware.com/comp_guide2/search.php?deviceCategory=server&details=1&pFeatures=60&page=1&display_interval=50&sortColumn=Partner&sortOrder=Asc&bookmark=1) Due to time constraints I’m only going to list a few of the many possible servers.

Cisco C240 M3/M4

 

Dell

 

 

 

HP

 

VMware

With the majority of the physical hardware addressed, here are links to VMware GPU materials.

 

 

I know this isn’t a comprehensive list. If there is a link that you think is helpful please add a comment below with the link.

In my next post I will cover calculating how many users you can get out of a single card. Look for it in the next week or so.

Cheers.

Permanent link to this article: https://www.wondernerd.net/gpu-virtualization-links/

Configuring VMware Virtual Flash in vSphere 5.5


I recently had a question come in at work about the configuration limits of VMware Flash Read Cache. I thought this was some material the community might like, along with a walkthrough on how to install and configure it.

Before we get into how to set up VMware Virtual Flash, I think it’s wise to give a quick overview of what it is. VMware Flash Read Cache lets you offload some read I/Os from your storage by caching the data on local flash storage. In other words, you can keep a local copy of heavily read data in cache on the host.

In addition to this, you can also leverage the Virtual Flash on an ESXi host as a swap caching location. This means that if VMs are consuming more RAM than the ESXi host has (oversubscription), the host can swap memory pages to the SSD, allowing for improved performance compared to swapping to spinning disk. This also lets SMBs get more bang for their buck by leveraging less expensive SSDs instead of higher-density, higher-priced memory modules.

I will look at configuring both Flash Read Cache and swap caching below. For more information about VMware vSphere Virtual Flash, refer to the VMware vSphere 5.5 documentation (http://pubs.vmware.com/vsphere-55/topic/com.vmware.vsphere.storage.doc/GUID-07ADB946-2337-4642-B660-34212F237E71.html) and the links at the end of this post.

I have my trusty small lab environment (it runs on my laptop) set up, so let’s walk through the process and look at configuring the flash read cache and its limits.

 

First off, this is what we will need to set up Virtual Flash (a quick sanity check against these limits follows the list):

  • vSphere 5.5 or later
  • ESXi hosts 5.5 or later with:
    • SSD storage:
      • 8 or fewer SSD drives for the Virtual Flash Volume (VFV)
      • Each drive 4 TB or less
      • Maxing out at a combined 32 TB or less (8 drives × 4 TB = 32 TB)
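
As a quick sanity check, here is a tiny sketch that validates a proposed set of SSDs against those limits before you commit to a layout. The drive sizes are made up; the limits are the ones listed above.

# Validate a proposed Virtual Flash layout against the vSphere 5.5 limits above:
# at most 8 SSDs, each 4 TB or less, 32 TB combined.
MAX_DEVICES = 8
MAX_TB_PER_DEVICE = 4
MAX_TB_TOTAL = 32

proposed_ssds_tb = [0.4, 0.4, 0.8]  # hypothetical drive sizes in TB

problems = []
if len(proposed_ssds_tb) > MAX_DEVICES:
    problems.append("too many devices: %d > %d" % (len(proposed_ssds_tb), MAX_DEVICES))
problems += ["%.1f TB drive exceeds the %d TB per-device limit" % (size, MAX_TB_PER_DEVICE)
             for size in proposed_ssds_tb if size > MAX_TB_PER_DEVICE]
if sum(proposed_ssds_tb) > MAX_TB_TOTAL:
    problems.append("total %.1f TB exceeds %d TB" % (sum(proposed_ssds_tb), MAX_TB_TOTAL))

print("Layout OK" if not problems else "Layout problems: " + "; ".join(problems))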

Once we have these in place, let’s configure some Virtual Flash.

Configuring Virtual Flash

With the SSD installed in the ESXi host, log in to the vSphere Web Client (this can’t be done through the legacy vSphere Client executable). Navigate to the host you want to configure the read cache on (you can also do this across a cluster; see the vSphere documentation for details). Select the Manage tab, then go to Settings. Scroll down to the bottom of the list, select the Virtual Flash Resource Management section, and click the Add Capacity button.

Add_vFlash

The Add Capacity screen appears, listing all SSDs in the ESXi host. Place a check mark to the left of the SSDs you would like to use for Virtual Flash resources (see the red check mark in the screenshot below).

When we click the OK button, all of the data on the selected SSDs will be erased!

At this point we can click the OK button. If we wanted to, we could have added up to 8 devices here and had up to 32 TB of total flash storage. We are then returned to the Settings screen.

AddFlashDialog

The Virtual Flash Resource Management screen is now updated reflecting the amount of Cache we have available.

AvailableCache

At this point we have two directions we can go: configure the Virtual Flash Host Swap Cache or configure the Read Cache on a VM. Because the flash in my system is limited, I am going to set up the host swap cache first and use the rest for my VMs.

Configuring Virtual Flash Host Swap

To do this, click on Virtual Flash Host Swap Cache Configuration on the left side of the ESXi host’s Settings view, then click the Edit button.

HostSwapCache

In the edit dialog we can check the Enable virtual flash host swap cache option. Then we can configure how much should be allocated to the host swap cache; in my case I will allocate 8 GB. The amount you allocate will vary. You will also notice that you are able to see the min, max, and current values for the host swap cache. When finished, click the OK button.

ConfigVFHswap

We now have Virtual Flash Host Swap Cache configured. This means that if our VMs start swapping, the swapping can first consume the space we allocated for it before going to slower storage.

Configuring VMware Flash Read Cache

With the remaining space we can configure a read cache for select VMs. In my case I will be working with a Windows template VM. You will want to apply this on a per-VM basis, for those VMs that would benefit from a read cache, and only for the disks that would benefit.

To start, right-click the VM you want to provide a read cache to and select Edit Settings from the popup menu. Select the disk you want to work with and expand the storage settings for the disk. You will then see a Read Cache option. If you don’t see it, check the VM hardware version; it may not be version 10 or later, in which case you will need to upgrade the virtual hardware version.

It’s also worth noting, as you can see in the screenshots below, that this can be done non-disruptively. You can enable the read cache while the VM is running.

Enter the amount of Read Cache for the vDisk. In my case I’ll give it 5 GB. You should choose what’s best for your VM.

VFRC_VM_Basic

You can also click the Advanced button, where you can control the block size of the Read Cache. Note that if you change the reservation here, it will change the Read Cache amount on the previous screen.

VFRC_VM_Adv

Sizing the Read Cache is beyond the scope of this document. The best resource at this time is the “Performance of vSphere Flash Read Cache in VMware vSphere 5.5” white paper, which can be found here: http://www.vmware.com/files/pdf/techpaper/vfrc-perf-vsphere55.pdf

It is also worth noting that you cannot exceed the remaining amount of Virtual Flash resources; this is known as oversubscribing the Virtual Flash. So if you only have 5 GB of Virtual Flash left, you can only use 5 GB amongst all of your VMs.

Once you have set the Read Cache for the vDisk(s), click the OK button. You can repeat this process for any other VMs as needed.

 

That’s all there is to it.

Enjoy.

Here are some good resources as you configure Virtual Flash on your own.

 

If there is a topic you’re curious about, please let us know.

Tony

 

This blog post has been cross posted on the EMC Community Network (ECN) and can be found here: https://community.emc.com/community/connect/everything_vmware/blog/2014/10/14/configuring-vmware-virtual-flash-in-vsphere-55

Permanent link to this article: https://www.wondernerd.net/configuring-vmware-virtual-flash-in-vsphere-5-5/

Mirage 5.1 Web Client Install Quirk


I thought I would add a quick post on the Mirage 5.1 web client install. I’ve been setting up a lab recently with all of the VMware EUC products and Mirage is one that I’ve been working on.

I went through the install of the Mirage 5.1.0 server, and that install went smoothly. No errors or issues; I could use the MMC snap-in to manage the Mirage deployment without any problems. I then turned my attention to the web management deployment for my Mirage instance.

Since this lab is just a small deployment, I decided to install it on the same box as my Mirage server. It didn’t look all that hard. All I thought I had to do was run mirage.WebManagement.x64.11204.msi to install it, and if there were any config issues it would let me know, similar to other installs I have done.

It turns out that this is not the case. Most of the time when I’m installing a new version of an application, I have the install guide open so I can look up steps that seem confusing. Well, nothing seemed confusing about this. I just ran the MSI and it told me it had installed successfully.

When I tried to access the web interface, though, nothing happened. At first it was “this website cannot be displayed” messages. So I went back to the trusty manual. The manual says that you need more components of IIS than just the minimal defaults; in fact there are two groups of things that need to be installed. (You can see them here: http://pubs.vmware.com/mirage-51/topic/com.vmware.mirage.installation.doc/GUID-63B2D3FE-3DA5-41C9-8A17-06175EC05CFE.html)  Once all of these items have been included in the IIS role and you try to access the web management interface, you get an error like the screenshot below.

Mirage WebMngt Error 5.1

 

 

The long and short of the message is a “Bad Module” error, and it includes a few hints that sort of help, specifically that ASP.NET needs to be installed. The part that threw me, though, is that if I went into Services, the ASP.NET service was running on the machine.

After some digging and poking, I finally figured out what was up: ASP.NET was not installed, at least not fully. So how can this be fixed? It’s actually really simple.

  • Open up a command prompt on the system you are trying to install the Mirage Web Management on.
  • Enter the following command: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\aspnet_regiis -i
  • OR cd to the path: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\
    • Then issue the command: aspnet_regiis -i

Below is what this process looks like.

Mirage WebMngt ASPnet install

I had been poking around the web for a while and finally found what I’d been looking for in a Stack Overflow thread here: http://stackoverflow.com/questions/13749138/asp-net-4-5-has-not-been-registered-on-the-web-server

Once that’s completed, when you open a connection to the Mirage web client it should look something like this.

Mirage WebMngt Console

 

 

And that is how you get around the “Managed Pipeline Error” when installing the Mirage Web Management.

This blog post has been cross-posted on the EMC Communities page as part of my job and can be found here: https://community.emc.com/community/connect/everything_vmware/blog/2014/10/04/mirage-51-web-client-install-quirk

Permanent link to this article: https://www.wondernerd.net/mirage-5-1-web-client-install-quirk/

I shot an arrow into the air… (New Job)


My parting message when I left VCE on May 23rd of this year was based on Longfellow’s poem “The Arrow and the Song,” the gist being “I shot an arrow into the air, it fell to the earth I knew not where.” With my announcement today I complete the poem: “Long, long afterward, in an oak I found the arrow, still unbroken.”

[pullquote align=”right” textalign=”left” width=”30%”]

The Arrow and the Song
– Henry Wadsworth Longfellow

I shot an arrow into the air,
It fell to earth, I knew not where;
For, so swiftly it flew, the sight
Could not follow it in its flight.

I breathed a song into the air,
It fell to earth, I knew not where;
For who has sight so keen and strong,
That it can follow the flight of a song?

Long, long afterward, in an oak,
I found the arrow, still unbroken;
And the song, from beginning to end,
I found again in the heart of a friend.

[/pullquote]

Today I’m pleased to announce that I am now a Sr. Tech Marketing Manager for EMC, a large oak in the computer industry.

In my new position I will be part of the Federation (http://www.emcfederation.com/), developing technical material for our internal teams, partners, and clients. Much of this will leverage my knowledge of EUC technologies. I will also get the opportunity to expand my cloud knowledge, along with other areas of the Federation.

I’ll be working for @BGracely and @theSANzone, along with some great teammates, including @VMTyler, who just joined the team a few weeks ago.

This is an extraordinary opportunity to grow not only my own knowledge but the community’s shared knowledge as well. I think that is one of the most exciting things about my new position. I hope that in this position I will be able to continue to pay it forward; the community has given me so much, and I hope to be able to reciprocate.

I’m looking forward to the exciting times ahead, and I want to thank everyone who offered encouragement while I looked for this new opportunity.

 

Tony

Permanent link to this article: https://www.wondernerd.net/i-shot-an-arrow-into-the-air/

Backing Up My WordPress Site


Thought I would share a quick post on how I protect this site. For those who don’t know me, I started my career working in the backup industry. As such, I am a firm believer in the saying “if you don’t have 3 copies of it, then it doesn’t exist,” so I try to keep several copies of wondernerd.net floating around. How many can vary at any given time, but needless to say it’s a lot.

I’ll start with the first level of protection: basic backups of my WordPress site. How do I do it? I use UpdraftPlus (http://updraftplus.com/) to back up my WordPress site. One of my main reasons for choosing it is that UpdraftPlus also gets clean dumps of the site’s databases. You can configure it to send a copy to a cloud provider if you want; there is a whole list of them on the UpdraftPlus website. I’m a bit more paranoid than just sending them to a cloud site.

Once a backup completes, a post-process task running on my home computer logs into my website, pulls the backup down to my home network, and distributes copies to various locations. I make the connection and get copies of the files with PSCP (from the same folks who make PuTTY).

A typical backup methodology might be the following:

  • Keep the latest 3 backups for 3 days (or cycles)
  • Keep the end of week backups for 2 weeks (or cycles)
  • Keep end of month backups indefinitely

I’ve illustrated it in this picture.

Website Backup Diagram

How typical backups might work. 

  1. Backups are run on the website.
  2. Those backups are then pulled to the local system as daily backups.
  3. A copy of the daily backup from the start of the week is moved to the weekly backups location.
  4. At the beginning of the month, a copy of the backup is moved to the First of Month Backup location.

 

 

Here is the basic script that I use to do this:


setlocal enabledelayedexpansion

REM Work from the base backup directory on the local system
C:
cd "C:\Website_Backup\"

REM Break the date into usable pieces (assumes a US-style date: Day MM/DD/YYYY)
set MyYear=%Date:~10,4%
set MyMonth=%Date:~4,2%
set MyDay=%Date:~7,2%
set DOW=%Date:~0,3%

REM Number of daily backups to keep
set KeepFor=3

REM Move the oldest backup to a temp directory for deletion later
ren "C:\Website_Backup\%KeepFor%" "temp"

Set /A PrimeRead=%KeepFor%-1
REM Day counters used to shuffle the remaining directories up by one
set NextDay=%KeepFor%
set /A PrevDay=%KeepFor%-1

FOR /l %%x IN (1,1,%PrimeRead%) DO (
  ren "C:\Website_Backup\!PrevDay!" "!NextDay!"
  set /A PrevDay=!PrevDay!-1
  set /A NextDay=!NextDay!-1
)

REM Create the directory for today's backup
md "C:\Website_Backup\1"

REM The pscp command must be a single line, even if it wraps in your browser
pscp.exe -v -r -C -batch -sftp -pw PASSWORD USERNAME@YOURWEBSITE.COM:/home/WEBSITE/wp-content/updraft/* C:\Website_Backup\1 >> C:\Website_Backup\1\quicklog.txt

REM Now that the download is done, get rid of the oldest backup
rmdir /Q /S "C:\Website_Backup\temp\"


REM Snag monthly backups and keep indefinitely; grabs the first day of the month

if %MyDay%==01 (
  md "C:\Website_Backup\MonthEnd\%MyYear%-%MyMonth%-%MyDay%"
)

if %MyDay%==01 (
  xcopy /s "C:\Website_Backup\1" "C:\Website_Backup\MonthEnd\%MyYear%-%MyMonth%-%MyDay%"
)

Z:
set EOW=Sun
REM Sync a copy every Sunday to the third storage area
if %DOW%==%EOW% (
  rmdir /Q /S "Z:\Website_Backup\Temp"
)

if %DOW%==%EOW% (
  ren "Z:\Website_Backup\Second" "Temp"
)

if %DOW%==%EOW% (
  ren "Z:\Website_Backup\First" "Second"
)

if %DOW%==%EOW% (
  md "Z:\Website_Backup\First"
)
if %DOW%==%EOW% (
  xcopy /s "C:\Website_Backup\1" "Z:\Website_Backup\First"
)

if %DOW%==%EOW% (
  rmdir /Q /S "Z:\Website_Backup\Temp"
)

 

This is how the above bat script works.

  • First we change to the correct base directory on my local system.
  • Next we break the date out into a usable form we can use for keeping backups for a given time.
  • Next we shuffle our directories so we can age off the oldest backup at the end of the process. This script keeps 3 backups. You can change this by changing the KeepFor variable and creating the needed folders.
  • Now we create our latest directory where the backups will be placed.
  • Once we have that we are ready to go get the files. For this task I am using PSCP to securely download the files from my web server.
    • Note that the pscp command is a single line, even if it wraps when displayed above.
    • The parameters should be fairly obvious in the PSCP command, at least the ones in capital letters.
    • I also output the download process to a file so I have a quick list of the files backed up, along with any errors that occur during the backup.
    • [warning] Your password is stored in clear text; make sure to take appropriate precautions with this script.[/warning]
  • Once the files are downloaded, I get rid of the temp directory that contains the oldest backup in the set.
  • Now we get to the long-term protection. If it’s the first of the month (or whatever day you want to set it for), we copy the latest backup to a date-identified long-term backup directory.
  • The last 6 IF statements copy a weekly copy of the backup to a designated folder if it’s a Sunday. I would shorten it down, but IF statements in bat scripts don’t do well with multiple lines of code.
  • I put the above bat script in a directory with PSCP and all of the other directories (except the temp directories) pre-created.
  • Now all that needs to be done is to schedule the bat script to run.

And that’s it. I now have lots of copies of my website with relatively little effort, and since it’s more than 3 copies… it really does exist!

If you’ve made it this far, you are probably thinking, “This is great, how do I implement it?” That’s fairly simple.

  1. Install and configure UpdraftPlus for your WordPress site.
  2. Create your backup directory.
  3. In the backup directory place:
    1. A copy of PSCP (unless you’ve changed the path in the script)
    2. All of the backup subdirectories (1-3, First, Second); the rest should be created automatically. NOTE: If you plan to place weekly backups on a different storage location, you will need to create the First and Second directories on that location.
  4. Create a bat file:
    1. Copy the above code into a text document.
    2. Make any changes you want, such as the directories you are saving to and how many backup sets you want to keep on your local system.
    3. Save the file with a .bat extension into the backup directory.
  5. Create a Windows scheduled task to run the bat file created in step 4.
That’s all there is to it. You now have a fairly complex backup system for your website. And if you’re like me you can place copies across several different sets of storage so you are protected locally and remotely.

If you have any feedback to share please do. It would be great to continue to improve this script. Hopefully this has been helpful.

 

Permanent link to this article: https://www.wondernerd.net/backing-up-my-wordpress-site/