WriterAccess Webinar Archive
Think Like a Search Engine and Win
Thursday, September 29, 2011 – 1:00 PM ET
To win the war of words on the web, you need to know the rules. Or in this case, you need to know the Google algorithms that drive search engine listing positions. These "magical" formulas push down the bad content, give way to the rich content, and offer the relevant information Google users are seeking.
Join Scott Stouffer, Co-Founder and President, Technology at The SEO Engine, and Byron White, Founder of ideaLaunch, for this month's Content Marketing Webinar to learn how to:
- Pinpoint SEO problems on your site
- Fix identified SEO problems on your site
- Think like a spider bot
- Sting like an SEO wizard on steroids
- Put advanced SEO technology to work
- Decrease time spent on SEO
- Increase the impact of your SEO efforts
You'll learn how the SEO Engine—the world's first "transparent" search engine—reveals what Google is seeing when it crawls, scores, and ranks any site. You'll see how each page and link on the web is scored in "real time," and how you can use these insights to drill down and fix the issues that are affecting your rankings.
The slide deck from this webinar is available for download.
Byron: I have a guest with us today, Scott Stouffer, Co-Founder and President of Technology for The SEO Engine. Scott, welcome!
Scott: Thanks, Byron!
Byron: So, we're very excited about chatting with you about this concept of thinking like a search engine -- and, I hope, giving you the ability to "sting like a bee" when you leave this presentation. Let me go over a couple of logistics with you and then we'll chime in and go right at it here. For starters, we would love for you to ask questions throughout the presentation. While Scott is speaking, I'll be monitoring any questions that come in about his presentation and trying to get back to you with a quick answer, as I always like to do, so I'm communicating directly with you and sharing the answers with everyone else -- that really helps us.
Second of all, if you want to send any "Tweet" love to us, some direct messages would be really great -- something I really appreciate from anybody that's listening in. I also want to throw out an offer to get some thoughts from you on the webinar, perhaps a testimonial saying, "Byron, these webinars are fantastic, we love them, they help me with my business, thanks so much." In return for a testimonial, I will happily send you a $25 gift certificate to WriterAccess, the new model that we launched -- I'll give you a coupon code that you can use to get a $25 credit over there. It's a super-exciting model that we have, so I appreciate that feedback. We're actually going to try and do some work in the next few months on gathering testimonials for these webinars and pushing them to a wider audience to get the word out. So any help you can offer there would be really great -- just send it to firstname.lastname@example.org.
Without further ado, let's dive into this topic -- let me walk you through what our goals and objectives are today and spin you through things. For starters, I'm going to give you my brief, quick snapshot of the content marketing revolution; since we have quite a few new listeners on the webinar today, it's important that everybody gets on the same page with what the revolution is about, what our goals are in the fight, and how it all works. Next, I've put together some really cool tips that I think will help you understand how the search engines think and give you some quick takeaways that you can put to work immediately after my part of the presentation. And finally, we'll jump on the SEO Engine with Scott and learn about his incredible technology platform, which I've seen firsthand, and we'll take a close-up look at how the SEO Engine can become the engine of choice for getting some interesting analysis done on how your site is performing and how the search engines are seeing it.
So, without further ado, the content marketing revolution: what is it? It is certainly defined as the art of listening to customers' wants and needs, and that's probably the biggest opportunity we have in the marketplace right now: to better listen to what people want. We're doing a pretty good job of that -- tracking social media, your search box, and your analytics -- but I'm convinced there's a better way to understand customer wants and needs, and I think we're going to see some real advancement there. It's certainly the science of delivering content to them that's compelling, and you need to do that with a diverse portfolio of assets these days. It's no longer just a blog or white papers; it's really a diverse series of assets you need to generate. And the key, of course, is to catch readers that are orbiting at high speed, without a doubt, off your website. It's not just about publishing content on your website; you need to think beyond that and really look at a richer way to connect with people. The key to all of this is to publish and create information that people want and need. And these are just buzzwords here to help you understand the actual words I'm suggesting we start including in the content we create -- words like help, advice, insight, excite, innovation, rationalization, love, happiness. We need to get in tune with the information people want and need -- and let's face it -- everybody wants help and advice and guidance.
It's also testing campaigns to see what works -- a true scientific process, with lots of different choices you now have to sort through to find the greatest path to engagement. That means finding the most efficient path, and using it in social engagement, in qualifying leads, and in watching those leads move down through the funnel.
So that's pretty much the summary, and it can perhaps be best explained by what we believe is a six-step process. Content marketing really is a very methodical approach to your company's marketing efforts and what you're publishing, and it starts with planning, as you can see, and doing a lot of work with competitive intelligence. For those of you that have been on this webinar series, you know that I've spent a lot of time talking about all the steps to this process. I'll be speaking at DMA next week, walking people through this workflow, and at Conversion Conference in New York a week after that. DMA is actually in Boston next week, which is great for us -- I don't have to travel to speak at a conference -- but it's really exciting what's happening out there. This workflow is making its way into a lot of companies' hands and becoming an important part of how they operate. We believe it starts with planning but it ends with performance, and that's what we're trying to educate people on.
So, let's get into it -- how do I think like a search engine and sting like a bee? Let's review some tips that I've prepared for you here. The first I really already spoke about: you need to diversify your asset portfolio. This is an example of how, in the planning process, we take a close look at how many assets you have vs. your competitors. We can certainly look at total assets vs. the competitors, but we can also look at the individual assets and how diverse your portfolio is -- web pages, white papers, case studies, success stories, news articles. You really need a diverse portfolio of assets, and it is my professional opinion that the spider bots and the search engines are looking at these diverse assets and scoring your website -- and your likelihood to own a greater percentage of the market share, which I'll talk about a little bit later -- based upon the diversity of your assets.
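One way to put a number on the portfolio diversity Byron describes -- purely an illustration, not a formula any search engine is confirmed to use -- is a normalized entropy score over your asset counts:

```python
import math

def portfolio_diversity(asset_counts):
    """Shannon entropy of an asset portfolio, normalized to 0..1.

    1.0 means assets are spread evenly across types; values near 0
    mean the portfolio is dominated by a single asset type.
    """
    total = sum(asset_counts.values())
    if total == 0 or len(asset_counts) < 2:
        return 0.0
    entropy = 0.0
    for count in asset_counts.values():
        if count:
            p = count / total
            entropy -= p * math.log(p)
    return entropy / math.log(len(asset_counts))  # divide by max possible entropy

# A site that only publishes blog posts scores at the bottom...
print(round(portfolio_diversity({"blog": 90, "white paper": 0, "case study": 0}), 3))
# ...while an even spread across asset types scores at the top.
print(round(portfolio_diversity({"blog": 30, "white paper": 30, "case study": 30}), 3))
```

The metric is deliberately simple: it only captures evenness across asset types, not the quality or volume that the rest of the webinar discusses.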
We can, of course, look at social publishing frequency -- how social are you? How do you stack up with the competition? What is the volume that you're publishing? You can also look at your fan base in the social sphere -- and certainly, I believe the search engines are looking more and more closely at that. There have been some great articles recently that I'm going to try and summarize on our blog and get links to you, but it's becoming increasingly relevant to your overall success, and we're seeing a high correlation between increased traffic to your website and listing positions in the search engines, directly related to the growth of your fan base and the frequency that you're publishing in the social sphere. Re-tweets, people connecting with your content and connecting with you -- this is all becoming relevant and important.
Another thing to look at is your organic market share, and this is a snapshot of what we call an organic market share report. What we do with customers is break out keyword silos that we want them to focus on for content creation, optimization, and even distribution. Those silos all have names on them, and they feature groups of keywords that are mutually exclusive to the other silos, so we're focusing our effort. Then, for all of the keywords in a silo, we see whether we're ahead of the competition in the search engines or behind it, looking at the competitors as a whole, and we can then ascertain what percentage of the organic market share we hold for each individual silo. So we have a base camp from which to say, "OK, our goal and objective with content marketing is to write more and publish more and make it great content, and then measure and track how we're doing vs. the competitors for each of these individual keyword silos."
But don't think for a second that Google isn't looking at that as well. Of this group of keywords that all of these competitors are listed for, who's on top? Who's not on top? Who's rising in terms of publishing more content? Who deserves to have a top listing over someone else? This is what an algorithmic team of individuals is working on at Google every day, and this data, these stats, are absolutely relevant to thinking the way they do. You can also look at market share in terms of pay-per-click dollar value. My pal Mike Roberts over at SpyFu, in the recon reports, actually assesses the dollar value of a keyword: if you were to get a number one listing on Google, what would that keyword be worth? It's worth looking at the search volume -- how frequently that word is searched -- multiplied by the pay-per-click price; that begins to give you some idea of the dollar value of the keyword. Likewise, I think the search engines are looking closely at that, saying: we can't give all of the dollar value and the market share to one company; that's not going to get the diversity that our readers and searchers need at Google. So they've got to diversify the organic market share across the keywords, looking at them in a couple of ways -- both organically and via pay-per-click.
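The back-of-the-envelope valuation Byron attributes to the SpyFu reports -- search volume multiplied by the pay-per-click price -- can be sketched like this. The click-share factor and all the numbers are invented for illustration:

```python
def keyword_dollar_value(monthly_searches, cpc_dollars, click_share=0.35):
    """Rough monthly dollar value of a #1 organic listing.

    click_share is an assumed fraction of searchers who click the top
    organic result; the real figure varies widely by query.
    """
    return monthly_searches * click_share * cpc_dollars

# A keyword searched 5,000 times a month with a $2.40 pay-per-click price:
value = keyword_dollar_value(5000, 2.40)
print(f"${value:,.2f} / month")  # $4,200.00 / month
```

The point is not the exact figure -- it's that a keyword's value scales with both how often it is searched and what advertisers will pay per click.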
Next, certainly, link popularity -- probably the most talked-about element of thinking about how the search engines work. But wait a second here -- maybe "Likes" are the new links? There are a lot of people in this industry who believe that "Likes" could be slowly creeping up on Google's interest in valuing a website and a web property: their Facebook page, their corporate page, how many people like that page, what the value is in that sphere. So popularity certainly will be something to look at -- how do you stack up vs. your competition -- but I think we need to look beyond links as we move forward.
Number five is link building. I really believe that Google is looking very closely at the link structure within your own website. After all, that's what you control; it's a testament to your savvy with regard to building links. Here's an example of a page with a clearly spammy link strategy -- too many linked words -- and look at "content management" in the center paragraph: it's actually linked to the same page twice. I mean, enough is enough, and Google is clearly picking up on that -- way too dense.
But here's some "secret sauce" that I like explaining to people with regard to building links. It's a simple philosophy, one I've been practicing with our team here for many, many years as we try to build links with best practices. Imagine if, in this sentence, the phrase "big beef" linked over to Omaha Steaks: "I had a big beef with my boss the other day. The steak was cooked to perfection and melted in my mouth." Google would not know what "big beef" means unless they looked at the keywords and the words around that phrase -- and that, in my opinion, is precisely what's going on. Every time you craft a link, it's not just the link itself, it's the words around the link. Is the link contextually relevant to the content on the page you're linking to vs. the content on the page you're linking from? That's the "secret sauce" of link building; you can start practicing it today as you build your links out.
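The "words around the link" principle can be illustrated with a toy word-overlap score between the sentence containing a link and the text of the target page. Real engines use far richer models; this is only a sketch of the idea:

```python
def context_relevance(link_sentence, target_page_text):
    """Fraction of content words around a link that also appear on the
    target page -- a toy proxy for contextual link relevance."""
    # Tiny hand-picked stopword list, just for this example.
    stopwords = {"i", "a", "an", "the", "my", "was", "had", "with", "in", "to", "and"}
    context = {w.strip(".,").lower() for w in link_sentence.split()} - stopwords
    target = {w.strip(".,").lower() for w in target_page_text.split()}
    if not context:
        return 0.0
    return len(context & target) / len(context)

steak_page = "Order premium steak online. Our beef is aged and cooked to perfection."

# The anchor 'big beef' inside a complaint about a boss: weak context match.
print(context_relevance("I had a big beef with my boss the other day", steak_page))  # 0.2
# The same anchor surrounded by steak-related words: strong context match.
print(context_relevance("The big beef steak was cooked to perfection", steak_page))  # 0.8
```

A crafted link whose surrounding words overlap the target page's vocabulary scores high; the "big beef with my boss" sentence scores low, which is exactly Byron's point.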
Next, listing position volatility. We've tracked about a million keywords over at Word Vision, the technology publishing platform that I'll talk about later -- it's actually free, you can get a free trial of it and try it out -- and we find that a lot of our customers' listing positions are constantly changing and moving around. Clearly, Google is unhappy with a stable number one ranking for a particular domain name, or a stable number two.
So we find there are trends you need to pick up on when you're looking at the competitive landscape. What's interesting is: are your competitors constantly changing around those listing positions? Is Google looking closely at how its user base -- its searchers -- responds to your website by putting it in different positions? So there are a lot of things you really need to look at when analyzing how your site is performing and how the search engines are looking at it.
Next is optimization strength. Inside Word Vision, which again, you can get a free trial of, you can score content for SEO strength; it runs a little keyword density analysis and actually puts a score on the page, and you can select a particular keyword silo that you want a score for. But we have another tool, a free tool: you can type in www.pagestrengthtool.com and it will take you to a page on IdeaLaunch, our page strength tool. You just type in a phrase that you want to see how a page is optimized for, type in the domain name, and it will immediately print out a grade -- the "B" you see below here -- and tell you: is that phrase used in the title tag, the meta description, and the meta keywords? How many times does that word appear on the page? What is the link popularity of that individual page, and what's the latest position at Google and Yahoo and Bing for that particular keyword phrase?
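A miniature version of this kind of on-page check -- is the phrase in the title tag and meta description, and how often does it appear in the body -- can be sketched with Python's standard-library HTML parser. The sample HTML below is invented for illustration:

```python
from html.parser import HTMLParser

class PageStrengthChecker(HTMLParser):
    """Collects the title, meta description, and visible text of a page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = ""
        self.text_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.meta_description = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        else:
            self.text_parts.append(data)

def check_phrase(page_html, phrase):
    """Report where a keyword phrase appears on a page."""
    parser = PageStrengthChecker()
    parser.feed(page_html)
    body = " ".join(parser.text_parts).lower()
    return {
        "in_title": phrase in parser.title.lower(),
        "in_meta_description": phrase in parser.meta_description.lower(),
        "times_on_page": body.count(phrase),
    }

sample = """<html><head><title>Content Marketing Tips</title>
<meta name="description" content="Content marketing advice and tips.">
</head><body><p>Content marketing is about listening.</p>
<p>Good content marketing starts with planning.</p></body></html>"""

print(check_phrase(sample, "content marketing"))
# {'in_title': True, 'in_meta_description': True, 'times_on_page': 2}
```

The real tool also reports link popularity and live listing positions, which require crawling and rank data well beyond this sketch.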
So, this is basically a quick snapshot of what a spider sees on a page and I think it's a good representation of trying to get inside the mind of a search engine and see what they see on the page. I think Scott's going to bring us about 400 levels above that, but it's a nice, free, quick tool that lets you just see what's going on.
So content quality is the big buzz these days. Believe it or not, we're thankful that Google developed Panda and rolled it out across the board. We, of course, feel badly for the many sites that were hurt by it, but clearly, you need to be thinking hard about quality content these days. It's not easy to create great content, but I do believe it's easy to edit content and make it better. Here's an example of a paragraph that you're going to get if you choose to download this deck. It's a fairly well-written article, but take a look at what one of our editors does when they go in and edit content. There's a lot of red on this; we're changing things and rephrasing things and leaving the client -- and, I would argue, the search engines and readers -- with a better content asset. It reads better, it's tighter, it explains the value proposition in a better way. Editing, in my opinion, is the "secret sauce" that we're going to see moving forward, and I believe that Google certainly has linguists on staff with PhDs who are looking closely at syntax and words and phrases, with the ability to really analyze content and determine its value.
Distribution channels are, of course, relevant these days -- putting it all together in a format where it can be analyzed. You need to be thinking way beyond your blog, obviously, and getting your content out to the right place at the right time -- including, I always love arguing, printing a physical book. Your interest in this webinar was certainly the content, but getting a free "101 marketing tips" book was, I'm sure, of value to you. That's the "secret sauce" we've used for the last couple of years to attract people to listen in and hear what we have to say. But it's time for you to publish a book, and we're actually working on some really creative things.
Over at LifeTips, we've published about 70 books, and we're re-rolling out that book publishing program to make it super-simple for companies and executives to get a book published and produced, with help and support -- editorial support as well as an audience to distribute the content to. So, get the words out and the traffic in: that's really the message of distribution, and without a doubt, the search engines are seeing that.

Publishing depth and frequency: as we plow through these last few slides, notice that on March 10, we doubled the number of assets published on this particular website. And guess what? Come March, April and May, the traffic not only improved, which you'll see on the second slide, but the overall listing positions improved, which of course caused traffic to improve. So spike your content -- we call it "front-loading" content, where we basically train the spider bots to come into the site and see a whole bunch of new content, which makes them say, "Wow, lots of new content -- fantastic -- I'll come back more frequently and look for more." You're literally training the bots to come back to your site to look for this new stream of content you're publishing. And you need to stay steady with that; if you stop publishing, you'll see what happens -- your traffic will drop. This is not rocket science, but it's interesting to see it in a presentation like this and have a wake-up call.
Number nine is publishing frequency and depth. So, what does your site map look like? These are all landing pages on a particular website. How much depth do you see behind each of these pages? How much information is behind a particular page? That, I think, is something Google is paying a lot more attention to these days. You need depth -- deeper pages within your website that support the information people want and need about a particular page. Finally, time on site -- a little trick of the trade for you here. Google can't see the time on your site directly, but here's what it can see: someone does a search at Google for, say, content marketing. They find the IdeaLaunch website. Google can see that same person coming back to Google -- they have their IP address, they know they're coming back -- so Google can, in effect, see how much time that person spent on your site, and there are no regulations preventing them from doing so. That's probably, in my argument, the single most important and overlooked element for your website: time on site, and how much time people are spending there. I would argue that if you increase that time on site by having better assets and better content -- stickier content, or tools or resources or downloads or white papers, things that people want -- and people are spending more time on your site, you will, at the end of the day, improve your listing positions, increase the traffic on the site, and capture market share.
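The measurement Byron describes -- timing the gap between a click on a result and the searcher's return to Google -- reduces to simple timestamp arithmetic. A minimal sketch with invented timestamps:

```python
from datetime import datetime

def dwell_time_seconds(clicked_at, returned_at):
    """Seconds between clicking a search result and returning to the
    results page -- a rough proxy for time spent on the destination site."""
    return (returned_at - clicked_at).total_seconds()

# A searcher clicks a result at 1:05:12 PM and is back at Google at 1:09:42 PM.
clicked = datetime(2011, 9, 29, 13, 5, 12)
returned = datetime(2011, 9, 29, 13, 9, 42)
print(dwell_time_seconds(clicked, returned))  # 270.0
```

Whether and how Google actually weights this signal is not public; the sketch only shows that the raw measurement is trivial once you can observe both events.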
So, in conclusion: "Content marketing is all the marketing that's left." That's my favorite quote from Seth Godin, which inspired me to really start IdeaLaunch and LifeTips and Word Vision and all these other services we've rolled out over the last 14 years at IdeaLaunch. So thanks for listening in, everyone. I'm going to turn the presentation over to Scott. Take it away, Scott!
Scott: Okay, thanks, Byron. Welcome, everybody. I'd like to again thank Byron for inviting me to his webinar series. I definitely have a lot of information to show you, so we'll get to it very quickly here.
Today, basically, I'm going to introduce you to what the SEO Engine is. Obviously, there's a ton of information in it from a content marketing standpoint, and it can help you dramatically improve the results of your work. As you work with IdeaLaunch and with writing content, there are a lot of different pieces of information flying around out there, and one of the cool things about the SEO Engine is that it shows you, from a search engine's perspective, what's going on.
So, what is the SEO Engine? Essentially, what we've built here is a "transparent" search engine. If you look at our infrastructure, it looks just like that of Google, albeit a lot smaller, and it essentially mimics what a search engine does. We have our own internal index that we constantly try to improve, moving good content up and bad content down, and what we've done is develop a series of screens that sit on top of that search engine. You can think of them as a window into a search engine, letting you see in real time what's going on inside an engine like Google: how every page is being scored, how every link is being scored. It also gives you a lot of different control points, so you can actually take control of the SEO Engine and demand that it go out, crawl, and re-score a given site. That gives you the ability to make changes and then verify, before the Google bots come back to your site, that you have indeed made the proper changes.
There are a lot of advanced things you can do with this. Obviously, the first is that you can start thinking like a search engine. Viewing the internet from a search engine's perspective is a very complicated matter -- there are a lot of algorithms that go into it, millions and millions of calculations on just one specific website -- but the SEO Engine rolls everything up and prioritizes it. It takes all those complex algorithms and produces a very simple output for you to digest. Another way you can use the SEO Engine is to automate your SEO process. We allow you to schedule recrawls: you can set up a set of rules that get applied every time the SEO Engine goes out and recrawls a specific site, and you get notified if a rule is triggered. The big thing about the SEO Engine is that it decreases the amount of time needed to identify all the penalties and errors and go fix those issues, because you can drill into any specific screen inside the SEO Engine and see exactly what's going on.
So, what I'm going to do is go over what search engines like. I know Byron did a great job of going through an entire list of things you can do to think like a search engine. Essentially, all these signals being fed into Google are extremely important; there are millions of different signals that go into the query results when you type a search into Google, or any search engine for that matter. Lately, obviously, we've been focused on the Panda update and a lot of the issues that arise from it.
So what I want to go over first is what Google says. Now, Google, on their official blog, says they want to provide better rankings for high-quality sites -- sites with original content and information such as research, in-depth reports, and thoughtful analysis. Yadda, yadda, yadda, right? These are very high-level, generalized platitudes that are typical of the Google blog and usually cause more confusion than certainty among internet marketers. You, as a person who's writing content, obviously know all this already. So what exactly is the Panda update? What's going on here? For those of you who don't know yet, Panda is essentially synonymous with machine learning. What Google has done with the Panda update is apply a branch of artificial intelligence called "machine learning" -- a series of algorithms that attempt to evolve the search engine's algorithms on their own. What does that mean? Well, essentially, they have a series of manual inputs -- social inputs, the "+1" would be an example. Another example would be hiring a team of editors to go over sites and say "I like this" or "I don't like this," "I would put my credit card into this site" or "I wouldn't," and so forth. A whole series of signals gets fed back in, and this feedback loop is used to evolve the algorithms and make them more precise at matching good and bad content within Google.
So the idea is: how do you match new content with known content? Panda is a series of algorithms that has attempted to start doing that. They know what good content is, they know what bad content is, and their goal is to match your content -- the new content they discover -- with one of those buckets of content, whether it's good or bad or somewhere in between.
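The bucket-matching idea can be illustrated with a toy nearest-centroid classifier over made-up quality signals. Panda's actual signals and models are not public; every name and number here is hypothetical:

```python
def centroid(vectors):
    """Average of a list of equal-length signal vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance(a, b):
    """Euclidean distance between two signal vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify(page, good_pages, bad_pages):
    """Assign a page to whichever bucket of known pages it sits closer to."""
    good_c = centroid(good_pages)
    bad_c = centroid(bad_pages)
    return "good" if distance(page, good_c) < distance(page, bad_c) else "bad"

# Hypothetical signal vectors: [originality, depth, ad_density]
good = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2]]
bad = [[0.2, 0.1, 0.9], [0.1, 0.3, 0.8]]

print(classify([0.7, 0.6, 0.3], good, bad))  # good
print(classify([0.2, 0.2, 0.7], good, bad))  # bad
```

The point of the sketch is the mechanism Scott describes: human-labeled examples define the "good" and "bad" buckets, and newly discovered content is assigned to whichever bucket it most resembles.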
So, obviously, one of the most important things you need to know is: what is "good" content? You can see I put "good" in quotes because it means "good" in the eyes of a search engine. Judging good content is a very subjective matter from a human standpoint, but from a search engine standpoint, it remains a very objective one. It's based on a set of finite algorithms -- albeit algorithms now evolving slowly through some of this machine learning -- but they have very core principles. If you understand those principles and you can see into the search engine, your life gets a lot easier, because you can understand how the search engine is thinking -- not from a human standpoint, but from an algorithmic or machine standpoint.
So the goal, really -- the quick and dirty rule -- is simply to reverse-engineer what you know to be highly-rated content. One of the cool things about the SEO Engine is that it allows you to go in and reverse-engineer a site. You can put in a competitor's site, or a site you aspire to be like, and see what the link flow distribution is, what the market focus is for each page, what the content is like, what the duplicate content correlation levels are inside the site -- all kinds of different signals and factors that go into the scoring of every page and every link. Then you can try to mimic that, knowing that Panda will most likely group your new content into the "good" bucket if you follow the same semantics that a highly-rated site is using.
So what I'm going to do is take you through some of the SEO Engine screens today. I'm going to try to keep things very simple. They can get quite complex -- you can dive into as minute detail as you want -- and they have all kinds of different uses. The SEO Engine can really be viewed as a Swiss army knife: if you're a content writer, there's a series of screens you might want to focus on; if you're a link builder, there's a different set of screens you'll be interested in. It really just depends on your purpose, and today I'm going to take you through some of the content-related screens and how they can help you decipher whether the content you're writing is good or bad.
So the first screen is called the website search screen, and it's the easiest screen to understand -- the most analogous to a search engine. It looks just like a real keyword search listing screen, except it's for a specific site. I'll take you right over to it now. You can see I've typed in the test drive site, venicechamber.com, and you're seeing a search listing of all the pages in the site. And obviously you have the ability to type in a keyword like "Venice" and search all of the pages inside this site. Now, there are a couple of differences you'll begin to notice immediately between this transparent search engine's results and those of a search engine like Google, Bing, or Yahoo.
One is that you can actually see the query score, so you can see how relevant the results are to the query you just typed in. You can also see two bottom rows here, one of which is "market focus." Market focus is what the search engine thinks the page is about; we're showing the top market focus here, and no prepositions appear in any of the market focuses. Think of them as "broad match phrases" -- the order is not as important; it's really the grouping of words and meaning. There's actually a basket of market focuses, and we'll get into that when we reach the market focus detail screen. The second row here is net total link flow. Link flow is our version of "page rank" -- a very finely-grained version. We don't use the logarithmic 0-to-10 scale; rather, we show an actual absolute value, the value behind page rank.
So you can see and compare any page against any other page using that value, as well as any link against any other link, and it's a much more accurate way to look at the underlying potential of each document. The net total link flow is essentially the gross total link flow times the SEO Engine score, and we'll get into that in a little bit as well. And then, finally, there's the SEO Engine score: every page, every document, has its own SEO Engine score, and this is basically an efficiency rating. Think of it as, "How well am I using the potential ranking power of a given document?"
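The relationship Scott describes -- net total link flow as gross total link flow times the SEO Engine (efficiency) score -- is simple arithmetic. A sketch with hypothetical numbers:

```python
def net_total_link_flow(gross_link_flow, seo_engine_score):
    """Net link flow = raw ranking potential x efficiency score (0..1)."""
    return gross_link_flow * seo_engine_score

gross = 23.7       # hypothetical raw "page rank"-like value for a document
efficiency = 0.78  # SEO Engine score: 78% of potential retained after penalties

print(round(net_total_link_flow(gross, efficiency), 1))  # 18.5
# Fixing every penalty (efficiency -> 1.0) recovers the full gross value:
print(net_total_link_flow(gross, 1.0))  # 23.7
```

This is why the scorecard screens matter: the gross value is largely fixed by who links to you, but the efficiency factor is under your control.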
So that's the website search screen. The next screen is called the web page scorecard, and the scorecard will show you how the search engine is penalizing a specific page. It'll show you the actual calculation of the SEO Engine score, or efficiency score; the difference between gross and net total link flow; and all the link penalties, with links to the specific screen inside the SEO Engine for each penalty. So you can identify the penalties and then fix them very quickly by navigating to that specific screen. Let me take you through an example. If we click on any of these net total link flow or SEO Engine score values, we'll get the scorecard.
So let's scroll down here and see -- actually, one thing I'm going to do before the scorecard is explain the net total link flow. These values essentially separate documents on the internet. If somebody types the words "green frogs" into Google, you're going to get millions of pages matching "green frogs." The relevancies are all similar -- all the millions of signals that are calculated come out similar -- and so the way a search engine separates the wheat from the chaff is essentially to rank the underlying documents. That ranking scale is what we call net total link flow. You'll see results here that are very similar -- here are two documents that are very close in query score. In other words, they have about the same relevancy score for a given search result. This one is ranked over that one simply because of its net total link flow, which acts as a sort of tie-breaker for similar documents. So the higher the net total link flow, the more potential you have to rank higher for any given keyword search. It's not keyword-search dependent; it's simply the ranking potential of one document over another, used to break ties.
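The tie-breaking Scott walks through -- equal relevance resolved by net total link flow -- amounts to a two-key sort. A minimal sketch with invented documents and scores:

```python
# Documents with a query (relevance) score and a net total link flow.
docs = [
    {"url": "a.example/frogs", "query_score": 0.92, "link_flow": 18.5},
    {"url": "b.example/frogs", "query_score": 0.92, "link_flow": 23.5},
    {"url": "c.example/frogs", "query_score": 0.80, "link_flow": 99.0},
]

# Rank by relevance first; among equally relevant pages, higher link
# flow wins the tie. Even a huge link flow can't rescue low relevance.
ranked = sorted(docs, key=lambda d: (-d["query_score"], -d["link_flow"]))
print([d["url"] for d in ranked])
# ['b.example/frogs', 'a.example/frogs', 'c.example/frogs']
```

Note that `c.example/frogs` stays last despite its enormous link flow, matching Scott's point that link flow is a tie-breaker, not a substitute for relevance.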
So let's go ahead and click on one of these -- let me scroll down and see -- if you look at any of these results down here -- here's a good one right here. Let's click on this one; I'm going to show you a couple of penalties on this page. So, this is the scorecard, the web page scorecard screen. Now, every page on the internet is going to have one of these screens inside the SEO Engine. So you can actually go and see the summary and detailed penalties behind the SEO Engine score. And this is the summary page that you'll come to first. It shows you how the SEO Engine score and the net total link flow score are broken down. It starts with what we call the gross total link flow -- think of this as simply raw PageRank. This is calculated from our billion-page index; we run our specific algorithm that determines this link flow value. Essentially it's a popularity contest -- who's linking to whom.
Of course, it's taking into account the penalties that are applied as each link is actually scored, so it's not just a simple calculation -- there are millions of different calculations that go into this. This gives you a raw score, which is sort of the potential ranking power for this document. Then the penalties are applied, and you can see right here that only 78% of the original potential ranking power is being utilized because of the penalties, leaving the net total link flow value you see right here.
So you can see the net total link flow value and the SEO Engine score are very closely related. The SEO Engine score is simply an efficiency rating; it's saying this page is only at 78% efficiency, and you have the ability to correct that up to 100% and move this net total link flow value from 18.5 up to 23.5. That's a significant increase in ranking power, simply by addressing some of the penalties that are being applied to a document. So if we click on this penalty factor on the detailed penalties tab, you actually see some of the core signals that are being run across each document. Now, think of these signals as the general core business practices that every search engine applies to increase the effectiveness of its index. Every search engine has its own "special sauce" -- all these different signals that they've finely tuned and tweaked, and that they continue to tweak every day.
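A minimal sketch of that gross-to-net relationship, assuming each penalty simply shaves a fraction off the remaining link flow (the engine's actual penalty math is not disclosed; the penalty fractions below are made up to land near the demo's numbers):

```python
# Sketch of gross vs. net total link flow. The SEO Engine score is then
# just the ratio of net to gross -- the "efficiency."

def efficiency_and_net(gross_link_flow, penalty_fractions):
    """Return (efficiency, net link flow) after applying penalties."""
    efficiency = 1.0
    for p in penalty_fractions:
        efficiency *= (1.0 - p)      # each penalty shaves a fraction off
    return efficiency, gross_link_flow * efficiency

# Made-up penalties that land near the demo's numbers (roughly 78-79% of 23.5)
eff, net = efficiency_and_net(23.5, [0.10, 0.08, 0.05])
print(f"{eff:.0%} efficient, net link flow {net:.1f}")
```

Fixing a penalty corresponds to dropping its fraction from the list, which pushes the net value back up toward the gross value of 23.5.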
What the SEO Engine does is take the common denominator between all the different search engines -- these core principles that we know don't change over time. They're common to all search engines, and this allows you to see the major issues that are happening on every search engine for this specific page. These signals get applied to every page, and every search engine has identified them as signs that a page may be affiliated with spam or low-quality content. Now, obviously a lot of times documents will fall into this trap without even knowing it, and this is the great thing about the web page scorecard: you can actually identify which potential spam trap or signal has been tripped on your specific page or site. So, for instance, we have not quite enough unique words here.
Obviously, search engines prefer unique content. The SEO Engine will actually remove a lot of the stop words and prepositions. We have over 10 million stop words in our search engine, simply because we have this billion-page index and can identify which words uniquely distinguish one page from another; if a word doesn't do that, we remove it and essentially call it a stop word. So these are the actual unique words on the page. You can see that while there are plenty of words on here, there is a slight penalty because there aren't quite enough on the unique side of things.
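One way a document-frequency-based stop word list like that could work, as a rough sketch (the 80% threshold and the sample pages are assumptions for illustration):

```python
from collections import Counter

def find_stop_words(pages, max_doc_frequency=0.8):
    """Treat words that appear in more than max_doc_frequency of all
    pages as stop words: they don't distinguish one page from another."""
    doc_counts = Counter()
    for words in pages:
        doc_counts.update(set(words))    # count documents, not occurrences
    threshold = max_doc_frequency * len(pages)
    return {w for w, n in doc_counts.items() if n > threshold}

pages = [
    "the canals of venice".split(),
    "the beaches of florida".split(),
    "the markets of marrakech".split(),
]
print(find_stop_words(pages))  # 'the' and 'of' appear on every page
```

At billion-page scale the same idea surfaces far more than the classic English stop words, which is consistent with the "over 10 million stop words" figure.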
Keyword stuffed: this is a signal that actually gets tripped quite a bit. You can look at all of these -- the penalty effect is right here, so you don't have to be a PhD to figure this out. You simply look at what the major penalties are and address them one by one in whatever priority you want. If you want to identify where a penalty is coming from, you simply click on the penalty location -- in this case, we're clicking on the "keyword stuffed" penalty, and this takes you to a keyword stuffed analysis. You can see the word "Venice" is occurring 34 times, and the target right here is 30. We arrive at that by looking at the average number of occurrences of any word on this page, which is about four, and applying a statistical measure called standard deviation; anything over a specific number of standard deviations gets penalized. And you can see right here we need to remove four instances. If you want to see where the words are actually appearing, you simply click on the word -- on every screen you can always click through to the graphical representation inside the engine; this is a cached view, if you will -- and it will highlight every occurrence of the word "Venice." You can see right here there's not a lot of content on this specific page. So that's an example of the navigability of the SEO Engine. You can go in and actually see what's going on.
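The standard-deviation check described here can be sketched like so; the cutoff of three standard deviations and the sample word counts are assumptions chosen to produce numbers in the same ballpark as the demo:

```python
import statistics

def keyword_stuffing(word_counts, k=3.0):
    """Flag words whose count sits more than k standard deviations
    above the mean word count for the page. k=3 is an assumption; the
    transcript only says counts over 'a specific standard deviation'
    get penalized."""
    mean = statistics.mean(word_counts.values())
    stdev = statistics.pstdev(word_counts.values())
    target = int(mean + k * stdev)
    return {w: {"count": n, "target": target, "remove": n - target}
            for w, n in word_counts.items() if n > target}

# Made-up page counts: one word stands far above the rest
counts = {"venice": 34, "hotel": 5, "canal": 4, "gondola": 4,
          "tour": 3, "water": 3, "bridge": 3, "trip": 2,
          "boat": 2, "piazza": 2, "mask": 2, "glass": 2}
print(keyword_stuffing(counts))
```

Only "venice" clears the threshold, and the "remove" value is the number of instances to cut to get back under the target, just like the scorecard's recommendation.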
Outgoing paid links -- that's essentially a non-editorial link going out to an external subdomain. Missing alt text on 7% of the images, and you have some duplicate meta descriptions. So these are the penalties being applied to this specific document, and you can see immediately how to fix these specific issues and how everything's inter-related -- all this stuff is on each scorecard. If you were to click on any one of these scores, you would go to the web page scorecard for that specific page, and you can see how important that is. We talked about raising that score to 100%, and this would go to 23. You can see that this page would actually jump over this rank right here, up to the third result, simply by upgrading the efficiency score for this specific document. And the SEO Engine obviously allows you to review your work: you can simply instruct it to recrawl the site. It will send its spider bots out, get the site into the queue, start scoring and analyzing it over again, and you'll be able to see the results of your work.
On the next page, I want to go over what's called the Market Focus. This will probably be the most interesting part for most of you, because most of you are content writers. This is what a search engine thinks a web page is all about. And obviously this is a very, very complicated idea to actually implement -- we worked on this for over four years. It's a highly complicated algorithm that involves a lot of semantic indexing and a lot of comparing the context that each word is in, and it shows you how each document is categorized, or viewed, from a search engine's perspective. The main takeaway here is that you want to make sure each market focus is unique. We have a number of reports inside the SEO Engine -- a number of screens that will show you any overlapping market focuses for a specific site. And the reason behind that is that when a search engine provides results to its users, it very rarely serves documents with similar market focuses in the same search results.
So, for instance, if you have two pages with similar market focuses, the page that has more net total link flow will always get served first on a search engine result page. Essentially, the net total link flow of the page that's being masked is wasted for each search result, because that document is too similar. It's not going to pull in any organic traffic, and you'd be better off either changing its market focus, increasing its net total link flow, or removing the document and moving its net total link flow over to the page that actually is being served, to give it a better chance of ranking high.
So let's click on one of these market focuses -- I'm going to click on the first result here, and this will take you to what's called a market focus detail screen. We have two different variations of the market focus calculation: a three-word and a four-word market focus. The algorithm is the same; it just depends on the shingle, or the size of the groupings of words. If you're in a very competitive market, you're going to be working with a four-word market focus. If you're in a not-very-competitive market, you probably want to work with a three-word market focus. So this is really up to you -- you'll get used to it as you use the SEO Engine. Let me take you through how the algorithm actually works and how you can utilize it when you write content.
So, the general idea is we take the phrases, or the content, on the page and multiply them by a weighting vector derived from the anchor text -- the incoming anchor text into that page, the links and all the power coming into that page -- and that defines our market focus basket phrases. We actually have what we call a newer keyword density report: essentially, we filter out millions of non-unique words plus common stop words, and we show the number of occurrences of all these different phrases on the page. We then take the anchor text, look at all the incoming links, and associate a specific link flow with each of those incoming links. For instance, take the word "activities." When you click on that link, you actually see all the incoming links with "activities" in the anchor text. You can see it here -- it's highlighted for "activities." You're looking at a link listing screen right now, which we'll skip for now -- this is more for off-page SEOs and link builders -- but essentially what it shows you is the net total link flow share that's coming through a specific link.
In other words, how powerful is this link? You'll see the most powerful as well as the least powerful on the link listing screens. And if you total up all of the link flow shares, you get the total associated link flow value for that specific word. That becomes a sort of weighting vector, and the word "activities" shows up in the phrasing vector here. We multiply by this weighting vector and come up with a logarithmic weighting, or rank, of our market focus basket. You can see there are a lot of market focuses with "activities" in them. This is how you define, or sculpt, the basket phrases that will trigger your document to come up on Google. A very important concept.
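Roughly, the phrase-times-anchor-weighting idea might look like this; the baseline weight of 1, the additive combination, and the natural log are all assumptions, since the transcript only describes the general shape of the calculation:

```python
import math
from collections import Counter

def market_focus(page_words, anchor_link_flow, shingle_size=3):
    """Rank shingles (n-word phrases) from the page by on-page
    occurrences multiplied by an anchor-text link flow weighting."""
    shingles = Counter(
        tuple(page_words[i:i + shingle_size])
        for i in range(len(page_words) - shingle_size + 1)
    )
    scores = {}
    for phrase, occurrences in shingles.items():
        # a word with no anchor-text link flow contributes a baseline of 1
        weight = sum(1 + anchor_link_flow.get(w, 0.0) for w in phrase)
        scores[phrase] = math.log(occurrences * weight)
    return sorted(scores, key=scores.get, reverse=True)

words = "venice farmers market fresh produce venice farmers market".split()
flows = {"activities": 4.2, "venice": 2.0}   # total link flow per anchor word
top = market_focus(words, flows)
print(top[0])  # the phrase most likely to trigger this document
```

Repeating a phrase on the page and earning anchor-text link flow for its words both push that shingle up the basket, which is exactly the two levers the detail screen exposes.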
All of these phrases are actually triggers that pull your document up. Now, whether or not it has enough net total link flow to rank high for that specific market focus, or search, is a different matter. This gives you the ability to sculpt each document to have a specific, unique concept or meaning inside your site, giving your site the biggest, broadest, and most stable set of organic traffic. Remember, you don't want all your organic traffic coming in on one or two top pages; you're better off making each page a unique and equal player -- or statistically equal -- in organic traffic for your site. So this is a very interesting page to look at. Every document is going to have its own market focus detail screen, so you can actually see how the market focus is calculated.
If you wanted to rank better for a different phrase -- say "Venice farmer's market" -- you'd want to increase the number of occurrences of that phrase, without keyword stuffing, of course. You'll see the SEO Engine is all tied together, so when you make a change you can go and recrawl the site and see immediately if you've induced another penalty in a different area. There are a lot of different ways to solve a problem, but each one has its own considerations, including other penalties that may be applied when implementing that solution. So one way is to increase the number of occurrences of the phrase, or you can take one of the words and change it to "farmers," or add a link, or remove, increase, or lower the link flow share on a specific link. Obviously, these are more advanced techniques, and if you become one of our SEO customers, we'll actually train you to understand some of them. But suffice it to say there are a lot of different ways to sculpt each market focus basket to make sure you have the phrases you specifically want for each document. And of course, each one of these is going to have its own market focus detail screen. You can click into any one of these at any time and see those calculations in real time. So it's really cool.
The next thing I want to go over is duplicate content. Obviously, this is a major issue -- not so much outside syndication on other sites, but simply within a given website: as you write more content for a given subdomain, the correlation factor, or duplication, throughout the site starts to rise. You have a natural baseline correlation in your site simply because of things like footers, headers, and navigation bars, which will usually put each page in the 20 to 30% correlation range against any other document on your site. What we do is actually show you duplicate content. If we click on any one of these screens here, it takes us to what we call a scoresheet screen. This is sort of an overview of each page -- it groups together all of these specific screens -- but what we can do is click on the content tab, and on the content tab there are a number of really cool things going on. For one, you can see all the metadata coming from the server related to the content. You can also see the competition: we go into our billion-page index and show you all the different web pages with a similar market focus basket. So we say, OK, who else has this set of phrases in its market focus basket, and we pull out the top competitors. And again, you can click on any of these, go to their specific web page scorecard or scoresheet, and reverse-engineer what they're doing and try to copy that -- that's one strategy.
So for duplicate content, what we do is correlate each document against every other document on the site, and we provide a correlation factor, or percent duplication factor, between this page and every other page. It gives you the top 20 right here. If this gets above 50 to 60%, you'll start to see a penalty on the scorecard, lowering the net total link flow and the SEO Engine score as well -- in this case it's actually pretty low. You can click on that percentage duplication, and it will show you the overlap with the page I just clicked on, so you can see all the words being duplicated between this pair of documents. And as you can see, there's enough content here besides the footer, header, and navigation bar to keep that percentage of overlap at a reasonable number.
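A crude stand-in for that pairwise correlation factor is the share of one page's distinct words that also appear on the other; the engine's actual formula isn't disclosed, and the sample pages below are invented:

```python
def duplication_percent(page_a_words, page_b_words):
    """Share of page A's distinct words that also appear on page B --
    a rough proxy for the engine's correlation factor."""
    a, b = set(page_a_words), set(page_b_words)
    return 100.0 * len(a & b) / len(a) if a else 0.0

# Shared boilerplate (footer/header/nav) plus distinct body content
boilerplate = "home about contact copyright 2011".split()
page_a = boilerplate + ("venice canal tours gondola rides hotels "
                        "museums piazza walking maps restaurants food").split()
page_b = boilerplate + "florida beach rentals surf family fun".split()
print(f"{duplication_percent(page_a, page_b):.0f}% duplicated")
```

With only the boilerplate words shared, this pair sits near the natural 20 to 30% baseline; pages sharing whole paragraphs would climb toward the 50 to 60% penalty zone.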
So when you're adding content, you want to make sure you keep these numbers in check. And you'll see, with the SEO Engine, as you add content and recrawl the site, it will actually show you if you've introduced any issues. Your SEO Engine scores will start coming down, and you can click in and see that one of the scorecard signals is duplicate content.
So, just to review: you can see the max matched percent duplicate content here, at the top level right here, and if this goes above a certain amount, you will actually start to accrue penalties. So you'll be able to see this -- it won't be something that creeps up on you; it will be very visible when you use the SEO Engine. This is the screen that gives you the in-depth analysis of the correlation of each specific document.
So this is a great tool, as you add content to your site, for making sure you keep those penalties in check and keep your correlation factors in check -- so you're not adding content that overlaps with a lot of different pages. Of course, you can also drive duplication through market focus as well, so everything's correlated and interrelated. The great thing about the SEO Engine is that it's all part of one system. It's not the separate set of disparate SEO tools that you're used to. Everything's tied together, so when things start to shift, you see them immediately in a number of different metrics in the SEO Engine.
So finally, what I want to do is just go over any questions that you might have -- obviously, you'll direct those to Byron. Essentially, what I've shown you here are the main content-related screens. There are quite a few other screens inside the system -- I'll go over one since I have a little more time. You can click on the link flow distribution, and this will show you what I was referring to before and actually highlight the duplicate market focuses. You can see these ones in red are actually duplicates, and it's very important that each document has its own unique market focus.
We also have a site-level dashboard for each website that will show you that as well. There are a number of content-related metrics throughout the system, not just on the pages I've shown you here. Again, I'd like to thank Byron for inviting me to his webinar. I would like to invite everybody out there to read our latest article in this month's issue of "Visibility Magazine," entitled "Back to Basics." It covers some of the Panda update content penalties from a search engine's perspective. If you have any questions, you can go to our website, www.seoengine.com. It has a free SEO learning center, and the training videos about the SEO Engine are free as well. So if you want to educate yourself on how a search engine thinks and how it views the world -- which has essentially been the mystery of the last decade for internet marketers -- go to the SEO Engine learning center and it will answer almost all of the questions you have. So, that's it! I'm going to turn things back over to Byron, and he can answer any questions that you guys might have.
Byron: Terrific, Scott, great presentation, amazing technology, and we're all, of course, dumbfounded by the different pieces of data that we can look at on a particular web page, which is a good thing. We have some super questions in, so I'm just going to get to them right now and ask them.
I'll chime in as to answers to any of these questions as well, the first one was related to, "Do you have any examples of websites that were affected by Panda and if so, what did they do to rectify the problem?"
I've got a good one here. It's a customer so I can't list the name, but the company is in sort of the "bed, bath" space and owned a Yahoo store. Their traffic and their revenue tumbled about 30 to 40%, all in a matter of a very short period of time. Their biggest problem was duplicate content. They were taking descriptions from the manufacturer for a lot of their products. They had about 10 or 20,000 SKUs, and pretty much the majority of that content was duplicate. So they went into emergency mode, given that there were millions at stake, and actually recreated about 12,000 SKUs with original content on each of them. They used our WriterAccess marketplace, where you can connect directly with writers. They quickly entered the old SKUs, and the assignment was to rewrite the descriptions and make them original content. There were some other style guideline restrictions: they wanted a little sizzle, to make it interesting, and of course they did keyword density checks on all of the products. So it was a massive project, but it was successful. Their traffic is not back 100%, but it came back quickly. They were smart about the way they rolled out the content as well -- they rolled it out over a two- or three-month period, so the spider bots were seeing fresh content, along with some informational support pages. The bottom line is that they did a sanity check and said, "Here's our problem, we know exactly what it is, let's face it." So there's an example.
Let's see, we had some other questions we want to get to... "Our blog was originally published on its own, and then published as a feed on the e-commerce website. This year we do not seem to be getting Google credit for the original content on the original blog. Do we have to publish first on the original to get the SEO credit?"
So that's an interesting question. I think what's probably going on there -- and feel free to dive in, Scott -- my guess is Google's seeing the original content, and then when they republish that content as an RSS feed on their e-commerce site, it's like "Danger, Will Robinson! Duplicate content here!" It's already been published on the original site. Would you agree with that, Scott? It's probably what your engine would detect as well.
Scott: Yeah, and I want to quickly add something, going back to your last question. One of the things you can actually do with new content is set up your robots file to block the common search engines while allowing the SEO Engine bots to come in and analyze what you've done before you actually pull the trigger. That's a really important concept, because simply putting stuff out there is not really something you can take back. To prevent an issue where you put something out there that causes drops in ranking, it's always easier to catch it and test it first with a real search engine like the SEO Engine, and then go ahead and release it to the real world. You can do that through your robots file. So that's one thing that we like to do.
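As a rough illustration of the robots-file technique Scott describes, a staging section could be fenced off like this; "SEOEngineBot" is an assumed user-agent name for the SEO Engine's crawler, not a documented one:

```text
# Keep unreleased pages away from the public search engines
User-agent: *
Disallow: /staging/

# Assumed user-agent name for the SEO Engine's diagnostic crawler;
# an empty Disallow means "crawl everything"
User-agent: SEOEngineBot
Disallow:
```

Crawlers obey the most specific User-agent group that matches them, so the named bot ignores the catch-all block while Googlebot, Bingbot, and the rest stay out of /staging/ until the content is ready.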
Byron: On an interesting side-note, over at WriterAccess, when someone purchases an article, we use Copyscape to validate that that article hasn't been published anywhere else. And it goes out and actually will gray out any word strings that it finds. Is that the type of technology that you're using as well? Literally scouring the web and looking for word matches, sentence matches and paragraph matches and finding that duplicate content, Scott?
Scott: Yes, it's very similar -- a kind of homegrown version of Copyscape -- and we do a little more in-depth analysis within a specific subdomain. We'll actually show you that correlation factor, and then, for each document, whether that correlation incurs a penalty or not.
Byron: Nice. Sounds like you should whip that over to me so I can use it. Let's see... somebody had an interesting question about the "big beef" example -- "Could you explain the 'big beef' example that you showed?" I'd be happy to.
So, typically we laser-focus on the actual keyword phrase itself when we build an internal link into a sentence. Let's say in this case the keyword phrase is "big beef." The first sentence that uses the phrase "big beef" leaves you with ambivalence about whether "big beef" means an argument or a steak. So I'll read the sentences again: "I had a big beef with my boss yesterday. The steak was cooked to perfection and melted in my mouth." When you look at that, you don't know whether "big beef" means an argument or a steak. I've used this as an example countless times when I've spoken with customers. It's the words around that link phrase, and that's really hard to pick up on, as I'm sure you can attest, Scott. We put so much emphasis on the page the link came from, the PageRank of that page, how many links you have coming to your website -- but I wonder if Google is putting a lot of emphasis on the words around that link phrase, because certainly Google's technology is studying that closely. Do you have a comment on that, Scott?
Scott: Yeah, in fact, if you want to quickly send the presentation over to me, I can show you a screen that actually identifies that specific issue. What you're looking at right here is what's called a link scorecard, and we kind of skipped this up to this point. We actually are able to identify the relevance of a specific link, and again, like Byron says, there's a whole host of signals that go into it. We focus on the content around the link and, of course, the market focus overlap between the two documents involved in the linking. So, intrinsically, the determination of the relevance of a link is itself calculated by thousands of different calculations.
So it's sort of a recursive algorithm that gets applied to a page, then a link, then a page, and it all rolls up into a nice relevancy score. We have a way to see how relevant, for instance, a meta title is, or a meta description, or a link between two different documents. So, this is an example of a link right here, and obviously it's a very relevant link. Let me see if I can go back and find a link that's maybe not as relevant. Let me look and see here... For every link, you can click on a scorecard and actually see the penalties for that link.
So here you go -- here's an example of a link, and you can see that we've actually deducted 95% of the score, or the link flow, going through this specific link. Note that link penalties aren't quite the same as web page penalties, because the link flow still flows to the other links on the page -- you don't lose it; you're simply not getting as much power through that specific link. So if you're buying or selling links, or structuring your internal links, you want to be able to see this, so that you understand that in this specific case, not much link flow is going through this specific link.
You can see, obviously, there are different levels of relevancy. There's "not relevant," a couple of levels in between, and "excellent," and all of these are factors in how each link is being scored. So if you have any questions about whether or not a link is relevant: to a human eye, like Byron's perfect example, it might look just fine or passable. Maybe that would have worked a decade ago, but I can tell you right now, it doesn't work on the SEO Engine, and it probably doesn't work on Google, Yahoo, Bing, and the other major search engines. So that's one of the cool things you can use the SEO Engine for.
Byron: Question: "In the past, search words throughout your website were king. It sounds like you are not advocating duplicate content. How do you find the balance?" So, the person who asked this question needs to understand that we are advocating using keywords in content, and using them appropriately according to best practice. "Duplicate" can mean a lot of things to a lot of people, and I think it means something a little different to the person who asked this question. "Duplicate" doesn't mean duplicate individual words. Duplicate content tends to mean duplicate sentences, duplicate paragraphs, and actual duplicate chunks of content -- and you can see why Google and the search engines would not want you to do that. They want to choose one page that features that original content, not numerous pages that all carry the same content. The search engine needs to make a decision: which of these pages is most relevant, most popular, and most appropriate for us to display in our search results? So that's where that comes into play. The keyword stuffed screen you have up right now is probably part of the answer to that, but do you want to expand on that at all, Scott?
Scott: Yes, just keep in mind that it's sort of a game of "whack-a-mole." You fix one problem on one side of your site by taking a specific action, and you want to make sure you're not shooting yourself in the foot by inducing another penalty on a totally different signal. So, if you have a tool like this -- something that can give you an overview of all of the penalties that are going on in real time -- it will keep you from falling into that trap, so that as you add more content, you're not overdoing it or inducing things like keyword stuffed penalties or matched duplicate content.
Byron: Another great question here: "How will republishing content in the social sphere affect the bots reading it as duplicate content?" I'll take a first crack at that. I'm a big fan of publishing content out in the social sphere. There isn't a lot of room, for example, on Twitter to publish an entire article, or even a summary of your blog post; really what you're doing is trying to drive traffic, and drive the bots' recognition, to that particular link. You're hoping that people re-Tweet that and push it out once you make that post, and the more people that are out Tweeting and linking to a particular page, the better. I would not worry at all about duplicate content in the social sphere. That's really what the social sphere is all about: people talking about the same content asset or the same page. So it's a much different subject than the duplicate content issue we saw on the e-commerce site, where thousands of websites were potentially using the exact same copy on the exact same page, leaving Google with no choice but to pick one of those thousands of pages to list the actual product.
Scott: I'll answer that from a search engine's perspective. There is definitely a timeline, and sort of a time stamp, on every document inside a search engine for when it was first crawled, and we do have algorithms that will attempt to identify the first version of that content. This is a very important feature of Copyscape too, for instance. One thing -- maybe not so much in the social sphere, if I'm not misunderstanding the original question -- is that they might have meant press releases: anything you put out or syndicate that may also be on your site. And if that's the case, tip #1: before you do a press release, or before you syndicate anything, put it on your site first and verify that it's been crawled. That's really the major thing you can do to prevent Google or any major search engine from tagging you as syndicated or copied content.
Byron: Someone has a question, Scott, and I'm going to have you answer this, and I'm going to add something to their question: "What are the effects of using a subdomain for content? Does the value of content on subdomains pass to the base domain?" And before you answer that, can you talk about the distinction between, say, www.contentmarketing.idealaunch.com vs. IdeaLaunch.com/contentmarketing? Do your engines make those distinctions?
Scott: Yes -- so it's not so much delineated by subdomains anymore. If you look at a graph of the internet inside a search engine, it actually has no boundaries. So, what we do is group things into what we call "neighborhoods." Essentially, these are tightly-grouped documents inside the graph of the internet. A neighborhood could fall within a subdomain, or however else you dream it up; as long as a set of documents is interconnected -- tightly connected with groupings of other documents -- it's going to be viewed as the same neighborhood. Some of the things we will show you include how well your neighborhood is doing around a specific page, or a specific link, or a specific subdomain.
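A deliberately simple stand-in for the neighborhood idea is to treat neighborhoods as connected components of the link graph, ignoring subdomain boundaries; the real grouping is presumably based on a tighter notion of connectedness than plain connectivity:

```python
from collections import defaultdict, deque

def neighborhoods(links):
    """Group pages into 'neighborhoods' -- here, connected components
    of the link graph, with subdomain labels playing no role."""
    graph = defaultdict(set)
    for src, dst in links:
        graph[src].add(dst)
        graph[dst].add(src)          # treat links as undirected ties
    seen, groups = set(), []
    for node in graph:
        if node in seen:
            continue
        group, queue = set(), deque([node])
        while queue:
            page = queue.popleft()
            if page not in group:
                group.add(page)
                queue.extend(graph[page] - group)
        seen |= group
        groups.append(group)
    return groups

links = [
    ("blog.example.com/a", "www.example.com/x"),  # tie across subdomains
    ("www.example.com/x", "www.example.com/y"),
    ("other.com/p", "other.com/q"),
]
groups = neighborhoods(links)
print(len(groups))  # two neighborhoods, regardless of subdomain labels
```

Note that the blog subdomain and the www pages land in the same neighborhood because they link to each other, which is the point Scott is making: the grouping follows the links, not the hostnames.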
So, to answer your question: as you add content, not only is the data you're updating affected, but everything around it is affected too. Think of it as a "butterfly effect," because it truly is one. You can change one thing inside a neighborhood and it will disturb that neighborhood a bit, so you'll see perturbations in the scores of other documents in that neighborhood. It really doesn't matter whether you put the content on a subdomain, or come up with a trickier version like putting it on a one-way linked site, or whatever. Modern search engines really don't care. We're past that point. We've algorithmically learned to nullify tricks based on where the content lives. It really just comes down to who's linking to whom; that's what defines the neighborhood, and the neighborhood sets the boundaries of what affects what.
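[Editor's note: The "neighborhood" idea Scott describes can be sketched in a few lines of code: pages are grouped by link connectivity, not by which domain or subdomain they live on. This is a toy illustration, not The SEO Engine's actual algorithm, and the link data is entirely hypothetical.]

```python
from collections import defaultdict

def neighborhoods(links):
    """Group pages into 'neighborhoods': connected components of the link
    graph. Links are treated as undirected, because what matters is who is
    connected to whom, not where a page happens to be hosted."""
    graph = defaultdict(set)
    for src, dst in links:
        graph[src].add(dst)
        graph[dst].add(src)
    seen, groups = set(), []
    for page in graph:
        if page in seen:
            continue
        # Depth-first walk to collect everything reachable from this page.
        stack, group = [page], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups

# Hypothetical link data: a subdomain page interlinked with the main site
# lands in the same neighborhood; an unlinked site forms its own.
links = [
    ("idealaunch.com/", "blog.idealaunch.com/post-1"),
    ("blog.idealaunch.com/post-1", "idealaunch.com/services"),
    ("unrelated.com/", "unrelated.com/about"),
]
for group in neighborhoods(links):
    print(sorted(group))
```

Note how the subdomain page ends up in the same group as the main-domain pages purely because of the links between them, which is the point Scott is making about subdomains not being a meaningful boundary.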
Byron: Excellent answer. Another question came in that I'm going to try to interpret. It's about blog content and paragraph teasers for those blogs, and whether search engines see that as a potential duplicate content problem. For example, a page of your blog might list the latest five posts, each showing the first three or four sentences, and then you click to learn more and it opens the whole post. Are the bots treating that as duplicate content? Are you being penalized for it? Is it affecting your blog? What are your thoughts on that?
Scott: Sure. Well, there are two different questions there, and you rightly asked both of them. Does the engine see the duplicate content? The answer is yes. And does it penalize for the duplicate content? That really depends on the level of correlation, which is why it's so important to be able to see the correlation inside one's own site. In the case you describe, the correlation factor is obviously within reasonable bounds. But let's say you don't have much content on that page other than the five snippets. Then there will be a significant amount of correlation between the home page and the pages those snippets point to. So it really is two different questions. The search engines are always correlating, always comparing the vectors of two documents to see how closely similar they are, and it really just comes down to how well correlated those documents are. You can have snippets and pages like that.
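[Editor's note: The "correlation" Scott keeps referring to can be illustrated with a simple cosine similarity over word counts. This is a rough stand-in for whatever proprietary measure the engines actually use; the sample texts below are made up.]

```python
import math
import re
from collections import Counter

def correlation(doc_a, doc_b):
    """Cosine similarity between bag-of-words vectors: 1.0 means identical
    wording, 0.0 means no shared terms. A teaser snippet scores high against
    its full post, but not 1.0, because the post adds unique content."""
    vec_a = Counter(re.findall(r"\w+", doc_a.lower()))
    vec_b = Counter(re.findall(r"\w+", doc_b.lower()))
    dot = sum(vec_a[w] * vec_b[w] for w in set(vec_a) & set(vec_b))
    norm = (math.sqrt(sum(c * c for c in vec_a.values()))
            * math.sqrt(sum(c * c for c in vec_b.values())))
    return dot / norm if norm else 0.0

snippet = "Five tips for writing rich web content that ranks."
full_post = ("Five tips for writing rich web content that ranks. "
             "Search engines reward pages that answer a reader's question.")
print(correlation(snippet, full_post))  # high overlap, but below 1.0
```

The takeaway matches Scott's point: a short teaser against a long unique post stays "within reasonable bounds," while two pages that are nearly all shared text would push the score toward 1.0.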
Really, where you get into trouble is with tab pages, or pages that have almost exactly the same content in a different format. In those cases, you're going to want to block one version of those pages in the robots file. You always want to make sure the search engine sees only one version of a document at most. As far as snippets and things like that in content management systems like Joomla and WordPress, those are going to be hit-or-miss. It really depends on the situation and the context, and from a human's perspective you really can't tell what's going on unless you use a tool like this, where you can actually see the correlation factor and be sure you're within bounds and not inducing any penalties on a specific page.
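[Editor's note: As a concrete example of blocking one version of near-duplicate pages in the robots file, a robots.txt along these lines keeps crawlers out of a print-formatted copy of each page. The paths here are hypothetical.]

```
# Hypothetical robots.txt: index the canonical pages, but block the
# print variants that duplicate their content in a different format.
User-agent: *
Disallow: /print/
```

A rel="canonical" tag on the duplicate version is another common way to accomplish the same thing, but the robots.txt approach is the one Scott mentions here.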
Byron: Scott, we're out of time, and I really want to thank you for being on the show today.
Scott: It was a pleasure, Byron! I hope everybody gained a lot of information from our presentation. And obviously, if you have any questions for me or Byron, my door's always open, so I'd love to hear from you.
Byron: Thanks again, everyone. If you have any feedback on this particular webinar, I'd really appreciate it -- email@example.com. You can also Tweet me anything you found positive or interesting at @byronwhite. Thanks very much for tuning in, everyone, and I'll see you next month for another great webinar. Thanks again, Scott. Goodbye, everyone!