Today’s post and podcast episode topic comes from community member Jeff, who recently lost all of the data from his site hosted at Bluehost, even though he was paying extra each month for them to take backups of his site.
I mentioned that as one of the primary drivers for switching my hosting recommendation back over to Webhosting Hub a few weeks back.
In today’s Podcast episode I talk in detail about how you can follow the steps Jeff provided below, to recover your website data, in the event you don’t happen to have a good backup of your website.
Thanks to Jeff for the detailed write-up, here it is….
My Niche Site Went Down and The Backups Were Bad, And It Could Happen to You Too!
You Don’t Have to Be Technical to Handle A Data Loss Situation
If you can do these basic things, you can recover the bulk of your website, even if your web hosting provider fails and you don’t have backups of your own to turn to.
All you need to be able to do to get most of your website back is:
- Have a basic understanding of WordPress. Enough to create your own posts or install a plugin.
- Be able to search Google for your site by web address.
- Copy and paste text from Google into a text editor such as Notepad and save the files to your computer or out to the cloud.
- Once you have a working WordPress install, paste the saved content back into WordPress one post or page at a time.
I personally had to go through this process myself not long ago. I want to share my story and the steps I took to recover, in order to help anyone else out who happens to find themselves in a similar position.
The Back Story In a Nutshell
About three months ago, I noticed my primary niche site had some unexpected downtime. I reached out to my hosting provider and found that they had a server failure. Stuff happens; four hours later it was back online.
I felt secure in the fact that my hosting provider makes backups of my site and if something should happen I could recover all of my content.
Fast forward to about two weeks ago, when my site went down again. This time it was much worse. My site content, all of my site content, was just gone. I asked my hosting provider to kindly restore my site from backup; they agreed and contacted me an hour later…
The backups were all bad. My site was just gone. I was angry and sad. I’d worked so hard on my new niche site and thought I was safe against this sort of thing. Needless to say, this was an extremely unwelcome reality check.
I Got (Most) Of My Site Back and Here Is How…
Google has some interesting tricks under the hood. One of them is the ability to look for content only on a specific site.
If you Google for site:YourDomain, you’ll get a list of every page Google has ever visited that belongs to your website.
I’ll show you. My niche site is defeatdebtcollectors.com and its main focus (as you might have guessed from the name) is how to get debt collectors off your back and fix your credit.
If I want to see everything Google has for my site, I’d just type this into the Google search box:
site:defeatdebtcollectors.com
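If you would rather bookmark the search than retype it, here is a minimal Python sketch that builds the link for you. It assumes Google's usual search address format, and the function name site_search_url is just something I made up for illustration.

```python
from urllib.parse import urlencode


def site_search_url(domain):
    """Build a Google search link limited to one domain via the site: operator."""
    # urlencode escapes the colon in "site:example.com" so the link stays valid.
    return "https://www.google.com/search?" + urlencode({"q": f"site:{domain}"})


print(site_search_url("defeatdebtcollectors.com"))
```

Opening the printed link in a browser should be equivalent to typing the site: query into the search box by hand.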
The Site’s Down; How Do I Get My Content Back?
If you find yourself in a similar situation, all is not lost. Here is what you do to get the bulk of your content back more or less the same way I did.
Every time Google visits your site, it makes a copy of the page it visits and saves the text and other content in its own database. It has to do this, or searching the Internet would take hours or days instead of a few seconds here or there.
So, for our purposes, Google’s cached copy of your site is the closest thing to a backup you’re ever going to get.
Once you have the cached content on screen, you have two basic options: full content or the text only option. Unless your site is especially image heavy, you only need the raw text. Also, in some cases the “full version” will just show a blank screen and never load — you’ll be forced to use the text only version.
Delving Into the HTML Painlessly
At this point you need to look at the HTML of your site by hitting CTRL + U on your keyboard. Don’t worry, this won’t hurt, and you don’t have to understand what you’re looking at.
Once you’re looking at the HTML “code” that makes up your website, all you have to do is look it over until you see the text of your post. Once you have that, look for the first <p> paragraph tag. Highlight all the way down until you reach the end of your content. There should be a closing paragraph tag that looks like this: </p>.
Hit CTRL + C to copy that text, then open up whatever text editor you’re using and hit CTRL + V to paste the text into it. Save the file and you’re all set.
Do that for each post you can find in Google’s cache.
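If you have dozens of posts and are comfortable running a small script, the paragraph hunt can be partly automated. The sketch below is not the manual method described above: it uses Python's built-in HTML parser to pull the plain text out of every <p> tag in a page you have saved, which means links and bold text are dropped. It also grabs every paragraph on the page, so expect to trim sidebar or footer text by hand. The names ParagraphGrabber and extract_paragraphs are made up for this example.

```python
from html.parser import HTMLParser


class ParagraphGrabber(HTMLParser):
    """Collect the plain text found inside each <p> tag of a page."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False  # True while reading inside a <p>
        self.pieces = []           # text fragments of the current paragraph
        self.paragraphs = []       # finished paragraphs, in page order

    def finish_paragraph(self):
        # Join the fragments and collapse runs of whitespace into single spaces.
        text = " ".join("".join(self.pieces).split())
        if text:
            self.paragraphs.append(text)
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.finish_paragraph()  # a new <p> ends any unclosed one
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.finish_paragraph()
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph:
            self.pieces.append(data)


def extract_paragraphs(html_text):
    """Return a list with the text of every paragraph in html_text."""
    grabber = ParagraphGrabber()
    grabber.feed(html_text)
    grabber.close()
    grabber.finish_paragraph()  # keep a trailing paragraph with no </p>
    return grabber.paragraphs


sample = "<div>Menu</div><p>First <b>bold</b> paragraph.</p><p>Second &amp; last.</p>"
for paragraph in extract_paragraphs(sample):
    print(paragraph)
```

If you need the links and formatting preserved, stick with copying the raw HTML by hand as described above.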
What if I’m in Over My Head with HTML?
If you find working with HTML just too intimidating to manage, don’t worry; there is an alternate solution. You can skip viewing the source and just copy the text as it appears in Google’s cache. Then paste it into a text editor and save it. Once you have it saved you can use Markdown to preserve most of the formatting. Markdown is a very simple text formatting syntax that WordPress can support with the help of a plugin.
Saving The Content with a Text Editor
If you’ve gotten this far and cached copies of your site are loading from Google’s “memory banks,” this is a good sign. It is highly likely you will be able to get most of your content back using this method.
You need somewhere to save the content though, and Microsoft Word or Google Docs is not the best place to save the content, for technical reasons I don’t want to bore you with.
You’ll need a plain text editor. Just about any plain text editor since the beginning of time should work; here are some options:
- Microsoft’s Notepad. It’s on every version of Windows ever.
- Notepad++ is a more advanced text editor along the same lines.
- If you have an Evernote account, it will work too, as long as you don’t use any of the formatting options.
Some Extra Steps for Sanity
If you have a lot of posts, keeping track of all of the content you’re scraping off of Google and dumping to file can be rather taxing. It’s really easy to get confused, lose track of your progress, or save the same file more than once under different names.
I took two steps to minimize the chances I’d make a mistake and maximize the chance I’d get to keep my sanity.
- Put the Blog Post Title In The First Line of The File. This made it easy for me to know what post I was looking at when I opened the file.
- Keep Track of Post Links in a Simple Spreadsheet. A few minutes in Google Spreadsheets was a life saver. I still managed to misname a post or two. But this went a long way toward minimizing human error.
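For anyone who would rather script the bookkeeping, both habits can be rolled into one small helper. This is only a sketch under my own assumptions: the function name save_recovered_post, the file naming scheme, and the recovered-posts.csv log are all invented for illustration. A CSV file is used because spreadsheet programs can open it.

```python
import csv
import re
from pathlib import Path


def save_recovered_post(folder, title, url, body):
    """Save one post as a text file and add a row to a CSV tracking sheet."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # Build a safe file name from the title, e.g. "My Post!" -> "my-post.txt".
    stem = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"
    post_file = folder / f"{stem}.txt"

    # The title goes on the first line so the file identifies itself.
    post_file.write_text(f"{title}\n\n{body}\n", encoding="utf-8")

    # Append the title, original link, and file name to the tracking sheet.
    log_file = folder / "recovered-posts.csv"
    write_header = not log_file.exists()
    with log_file.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if write_header:
            writer.writerow(["title", "url", "file"])
        writer.writerow([title, url, post_file.name])
    return post_file
```

Call it once per post with the folder to save into, the post title, the original link, and the copied text. Note that two posts with the same title would overwrite each other's file, so rename one of them if that comes up.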
I’m Going To Assume WordPress Is Back Up At This Point But Your Site is Still Empty
At this point you can just create a new post and name it after a post you’re recovering. Give it the exact same name as before so that it’s most likely to have the same page name.
In the WordPress editor you have two options: the Visual editor and the Text/HTML editor. Just go to the Text tab and then paste the contents into the new post. Save it and preview it.
Take a close look at the preview. If everything looks good, you can publish it. If not, chances are you can fix it quickly with the Visual editor.
What If I’m Recovering with Markdown?
If you’re using the Markdown method, you need one extra plugin installed before you’re ready to go. Don’t worry, it’s really simple to install, just point and click.
Jetpack. This is a plugin that handles a ton of stuff, but we’re only really interested in one feature: enabling Markdown support. This will greatly speed up recovering your content.
With the plugin installed, head on over to the configuration options and make sure the Markdown option is checked.
Taking Stock of Your Situation
At this point, you should have several things going for you:
- You should have about as much content as you can “mine” out of Google’s cache.
- You should have one text file per post, formatted in Markdown.
- You should have a spreadsheet with the name and actual link to your article in it.
- WordPress should be up, with a very preliminary setup but empty of content.
- The Jetpack plugin should be installed, activated, and the Markdown feature should be enabled.
If you have this much done, congratulations! Most of the “hard” work is done.
Putting The Content Back In Its Place
Putting Humpty Dumpty back together again is as simple as creating new posts and pages with the exact same names as before.
Create a new post — reference your spreadsheet as needed to keep your sanity — and paste the Markdown formatted text into the post and click Save as Draft.
Once you’ve put all of your pages and posts back onto your site, go back and preview each one. Make sure to do a thorough sanity check: ensure that outbound links work correctly, that nothing obvious is missing, and that nothing unexpected has happened.
At this point you should be ready to go: open each page and post and click the Publish button. If you haven’t already, now is a good time to set your preferred WordPress theme.
You’re not done, but you’ve accomplished a lot by this point. Take a break and clear your head — you’ve earned it. Your site is back up; all that’s left is a bit of cleanup to contain the damage.
Damage Control
No matter how careful you have been up to this point, it’s absolutely inevitable that you have a few broken links in the mix. Don’t feel bad, it’s just part of reality and not too difficult to fix.
Where are These Broken Links Coming From?
In at least a small number of cases, you’ve probably wound up with a few misnamed articles. Don’t sweat it, you can fix it just by changing the post name (not the title) back to whatever it was before. It’s a simple fix.
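If you kept a spreadsheet of the original links, a short script can point out which post names no longer match. The sketch below assumes the post name is the last part of each link, which holds for common WordPress permalink settings but may not match yours. The function names post_name and missing_post_names are made up for this example.

```python
from urllib.parse import urlparse


def post_name(url):
    """Return the last path segment of a link, e.g. 'my-post' for /2014/my-post/."""
    return urlparse(url).path.strip("/").rsplit("/", 1)[-1]


def missing_post_names(original_urls, restored_urls):
    """List post names from the old site that have no match on the rebuilt one."""
    restored = {post_name(url) for url in restored_urls}
    return sorted({post_name(url) for url in original_urls} - restored)


# Example with placeholder links: the second post was restored under a misspelled name.
old_links = ["http://example.com/fix-credit/", "http://example.com/stop-calls/"]
new_links = ["http://example.com/fix-credit/", "http://example.com/stop-call/"]
print(missing_post_names(old_links, new_links))
```

Any name the script prints is a post whose name on the rebuilt site should be changed back to the original.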
Other edge cases will show up too, most notably tags and the dates of certain posts. You’ll likely have to “wing it” and add tags back to posts piece-meal.
As for post dates, don’t let it bug you. If you managed to capture the original post date, feel free to set that correctly. If you didn’t capture it or it’s too much work, don’t let it bother you, the change will have very little impact on your page’s ranking in Google.
You might want to check out a WordPress plugin called Forty Four, which captures bad backlinks and logs them for you so that you can fix them. You may want to run this plugin for a few weeks after the transition to the new site. You can get rid of it once the dust has settled.
What About My Images?!
If you were very fortunate, Google cached both your original website’s text and the images. If that is the case, you can simply right-click the cached images and save them. If you’re not so lucky, you have one final option: Archive.org – The Wayback Machine
What the heck is that? Remember the DeLorean from Back to the Future? It’s sort of like that. The Wayback Machine lets you look at an old copy of a webpage days, weeks, months, or years in the past. Really, the whole thing, often including files too.
If your site was very image heavy and you weren’t lucky enough to recover the images from Google’s cache, this is your next best option. There is a catch: the Wayback Machine only indexes a site periodically, think once or twice a month — sometimes less.
That’s a lot less often than Google. So you may have a fairly large gap between the last time the site was indexed by the Wayback Machine and the present. Translation: you probably aren’t going to be able to get all of your images back, but it’s a whole lot better than nothing.
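To check whether the Wayback Machine has a copy of a given page without clicking through the site, you can ask its availability API. The sketch below is a minimal example that assumes the JSON reply has the shape archived_snapshots, closest, url. The function names are made up for illustration and example.com stands in for your own domain.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the API request link; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp  # ask for the snapshot closest to this date
    return API + "?" + urlencode(params)


def closest_snapshot(response):
    """Pick the snapshot link out of the parsed JSON reply, or None if there is none."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def find_snapshot(page_url, timestamp=None):
    """Ask the Wayback Machine for its closest saved copy of page_url."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))


if __name__ == "__main__":
    # Needs an internet connection; prints a snapshot link or None.
    print(find_snapshot("example.com", "20140101"))
```

If a snapshot link comes back, open it in your browser and save the images from there, the same way you would from Google's cache.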
Preventing Future Issues
At this point, it’s highly likely that your site is about as fixed as it’s going to get. I’m sure you’re wondering, “how on earth do I prevent this from happening again?” Quite simply, you make your own backups.
You don’t have to be very technical at all. Like a lot of things in WordPress, there is a plugin for that.
You can, of course, use whatever you’d like, but one of my favorite options is UpdraftPlus WordPress Backup. There is a free version, it’s easy to install, even the paid version is very inexpensive, and it supports sending my backups directly from my website to Dropbox. I don’t even have to think about it. I set it once and forget it.
If the site ever goes down again unrecoverably, I just deploy a new WordPress install, put the Updraft Plugin back on, and tell it to recover from one of my backups on Dropbox. It’s very hard to get a more elegant recovery method than that.
Wear Belt and Suspenders Both
Everybody makes mistakes. It’s really easy to trust the technology involved in keeping a website up and running. We’re all used to being able to click an icon or run a command and have exactly what we want. Unfortunately, that’s not reality. Technology is more delicate than most people realize and the unexpected can happen.
Am I mad at my former hosting provider for dropping the ball? Yes, I sure am. I wish they would have addressed the issue properly and gotten me my data back. Failing that, I wish they would have owned up to the problem and helped me recover what content I could for my site. If they’d done that much (and not had any more outages), they probably could have kept me as a customer. But they didn’t. I followed Chris’s recommendation and switched over to Webhosting Hub.
It’s my niche site, with my data, and I’m responsible for it, not my hosting provider. It’s in my best interest to “wear belt and suspenders both”, as my grandfather used to say, and keep my own backups too. If I’d had them when this went down, I could have recovered my site to a different server and been up in 1–2 hours, instead of the several-day outage I actually took.
Conclusion
Thanks again, Jeff, for the great write-up. The moral of the story: take responsibility for your own website backups using a tool like UpdraftPlus WordPress Backup, and store them offsite in a location like Dropbox. That is your best bet for being able to recover your website from a catastrophic failure.
If it is too late and you’ve already lost data, you can follow the steps above to attempt to recover your lost data via Google cached pages and if all else fails, try Archive.org – The Wayback Machine.
On a positive note, after switching over to Webhosting Hub and getting his site back up and running, Jeff noticed an increase in his site speed vs Bluehost. If you’ve experienced problems with your existing web host, definitely consider giving Webhosting Hub a try.
They take care of transferring over your websites from other companies like Bluehost for free with zero downtime. In many cases, your site will run quicker on their brand new SSD drives. You can also save a significant amount of money by going through my discount link, specifically for customers of the site.
Hopefully you never experience a complete data loss like Jeff, but if you do, the steps above can help greatly to get your data back and get your website back up and running.
If you have any questions for Jeff or myself, feel free to leave a comment below or jump on over to the Private Facebook Mastermind group. You’ll most likely get a quicker response from Jeff over there.
Thanks and best of luck!
_____
Some of the links above may be affiliate links. If you decide to make a purchase, I would receive a commission, at no additional cost to you. If you do make a purchase, I sincerely thank you ahead of time.