Research and Analysis by Changyue Jack An
Google webpage rankings are largely determined by the number of links pointing out of a webpage and the number of links pointing into it. We can think of a webpage as playing one of two roles: recommender or recommended page. The more important your webpage is considered to be, the more highly valued your outlinks will be. Similarly, your webpage can gain status and climb the Google rankings if a very important webpage recommends your site via its outlinks. A letter of recommendation from someone like Warren Buffett, for example, carries much more weight than one from a stranger.
Before we dive into Google rankings and how to accelerate your growth, there are several terms that need to be covered. The first term is a “hub”, a webpage that contains only outlinks. We can imagine a hub as a very generous website with little power of its own. The second term is an “authority”, also known as a “dangling node” or “end point”. These are pages that have only inlinks and no outlinks. The final term we need to know is the “general webpage”, which contains both inlinks and outlinks. We’ll be using these terms, along with the exhibits in our appendix, to explain several scenarios that occur when webpages use links as part of their growth strategy to rank higher on Google.
The First Scenario we’ll examine is based on the diagrams in our appendix. In Exhibit 1, let’s assume that node 3 is our company’s webpage. Based on the diagram, it appears to be an important page because it has the largest number of inlinks and outlinks. Our page also connects the upper triangle of nodes with the lower triangle, giving six nodes (webpages) in total. Google therefore starts by scoring each node equally at 1/6.
Starting from these equal scores, Google uses a normalized hyperlink matrix H, which we’ll define as follows:
H_ij = 1/P_i if there is a link from node i to node j, and H_ij = 0 if there is no connection.
Here P_i is the number of outlinks from node i. For example, node 1 has two outlinks (to nodes 2 and 3), so H12 = H13 = 1/2.
You’ll notice, though, that node 1 doesn’t connect to node 6, so H16 = 0. Our final H matrix will look like this:
From \ To | P1 | P2 | P3 | P4 | P5 | P6 |
P1 | 0 | 1/2 | 1/2 | 0 | 0 | 0 |
P2 | 0 | 0 | 0 | 0 | 0 | 0 |
P3 | 1/3 | 1/3 | 0 | 0 | 1/3 | 0 |
P4 | 0 | 0 | 0 | 0 | 1/2 | 1/2 |
P5 | 0 | 0 | 0 | 1/2 | 0 | 1/2 |
P6 | 0 | 0 | 0 | 1 | 0 | 0 |
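The construction of H can be sketched in a few lines of Python. This is a minimal illustration rather than Google's implementation: the `LINKS` dictionary and the `hyperlink_matrix` helper are names assumed here, and the link lists are simply read off the rows of the table above.

```python
from fractions import Fraction

# Outlinks for Exhibit 1, read off the H matrix above (1-based page numbers).
# Assumption: LINKS and hyperlink_matrix are illustrative names, not from the source.
LINKS = {
    1: [2, 3],
    2: [],         # dangling node: no outlinks
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

def hyperlink_matrix(links):
    """Return H as a list of rows, with H[i-1][j-1] = 1/P_i if page i links to page j."""
    n = len(links)
    H = [[Fraction(0)] * n for _ in range(n)]
    for src, targets in links.items():
        for dst in targets:
            # Each outlink of page `src` gets an equal share 1/P_src.
            H[src - 1][dst - 1] = Fraction(1, len(targets))
    return H

H = hyperlink_matrix(LINKS)
for row in H:
    print(" | ".join(str(x) for x in row))
```

Using `Fraction` keeps the entries as exact values such as 1/3, which makes the printout easy to compare with the table by eye.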
Next, we’ll find the Google matrix:
G = αH + (αa + (1 - α)e)(1/n)e^T
α is the fraction of the time an individual follows the hyperlink structure rather than being taken to a new website. Based on the Google page ranking book, we’ll use α = 0.9. H is still the normalized hyperlink matrix, and a is the binary dangling node vector, with a_i = 1 if node i is a dangling node and a_i = 0 otherwise. Here we have one dangling node, node 2, so a = (0, 1, 0, 0, 0, 0)^T.
e^T is the row vector of all 1’s and e is the column vector of all 1’s. n is the number of pages, so here n = 6 and (1/n)e^T = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6).
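As a rough sketch, the Google matrix can be assembled entry by entry, assuming the standard form G = αH + (αa + (1 - α)e)(1/n)e^T. The function name `google_matrix` is an assumption made for this example, and plain Python lists stand in for a real matrix library.

```python
ALPHA = 0.9  # value of alpha quoted in the text

# Exhibit 1 hyperlink matrix H (row i holds the outlink shares of page i+1).
H = [
    [0,   1/2, 1/2, 0,   0,   0],
    [0,   0,   0,   0,   0,   0],    # node 2 is dangling
    [1/3, 1/3, 0,   0,   1/3, 0],
    [0,   0,   0,   0,   1/2, 1/2],
    [0,   0,   0,   1/2, 0,   1/2],
    [0,   0,   0,   1,   0,   0],
]

def google_matrix(H, alpha):
    """Assumed helper: G[i][j] = alpha*H[i][j] + (alpha*a[i] + (1 - alpha)) / n."""
    n = len(H)
    # a[i] = 1 when row i of H is all zeros, i.e. page i+1 is a dangling node.
    a = [1 if sum(row) == 0 else 0 for row in H]
    return [
        [alpha * H[i][j] + (alpha * a[i] + (1 - alpha)) / n for j in range(n)]
        for i in range(n)
    ]

G = google_matrix(H, ALPHA)
for row in G:
    print(" | ".join(f"{x:.4f}" for x in row))
```

Every row of G sums to 1, including the row for the dangling node, which becomes a uniform row of 1/6. That property is what lets the scores be read as probabilities.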
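Now, use the iteration
π(k+1)^T = π(k)^T G
where G is the Google matrix, to calculate the final rank of each page. Starting from π(0)^T = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6), we get π(1)^T = π(0)^T G, then π(2)^T = π(1)^T G,
and so forth until we get a vector π(k)^T
that has converged.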
Page | Node1 | Node2 | Node3 | Node4 | Node5 | Node6 |
Final Score | 0.0517 | 0.0737 | 0.0574 | 0.3487 | 0.1999 | 0.2686 |
Based on our calculations, the final ranking of the six pages is Node 4 > Node 6 > Node 5 > Node 2 > Node 3 > Node 1. This result contradicts our initial assumption that node 3 is the most important node.
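The repeated multiplication can be sketched as a small power iteration. This is an illustrative version under stated assumptions: `pagerank`, `ranking`, and `EXHIBIT_1` are names invented here, the stopping tolerance is arbitrary, and the update applies the same rule as multiplying by the Google matrix without building the matrix explicitly. With α = 0.9 the sketch gives the same ordering as above. The decimals themselves are sensitive to α and to the stopping rule, and in this sketch α = 0.85 lands closest to the tabulated scores.

```python
# Outlinks for Exhibit 1 (1-based page numbers); an empty list is a dangling node.
EXHIBIT_1 = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}

def pagerank(links, alpha=0.9, tol=1e-10, max_iter=1000):
    """Power iteration starting from equal scores 1/n (assumed helper)."""
    pages = sorted(links)
    n = len(pages)
    pi = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Score held by dangling nodes is spread evenly over all pages.
        dangling = sum(pi[p] for p in pages if not links[p])
        base = (1 - alpha + alpha * dangling) / n
        new = {p: base for p in pages}
        for src in pages:
            for dst in links[src]:
                new[dst] += alpha * pi[src] / len(links[src])
        change = sum(abs(new[p] - pi[p]) for p in pages)
        pi = new
        if change < tol:
            break
    return pi

def ranking(scores):
    """Pages ordered from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)

scores = pagerank(EXHIBIT_1)
print({p: round(s, 4) for p, s in scores.items()})
print(" > ".join(f"Node {p}" for p in ranking(scores)))
```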
The Second Scenario (Exhibit 2) that we’ll examine is derived from Scenario 1, which we just outlined above. This time, we’ll eliminate the outlink from node 3 to node 2 to see if we can alter our final ranking results:
Our H matrix now looks like this:
From \ To | P1 | P2 | P3 | P4 | P5 | P6 |
P1 | 0 | 1/2 | 1/2 | 0 | 0 | 0 |
P2 | 0 | 0 | 0 | 0 | 0 | 0 |
P3 | 1/2 | 0 | 0 | 0 | 1/2 | 0 |
P4 | 0 | 0 | 0 | 0 | 1/2 | 1/2 |
P5 | 0 | 0 | 0 | 1/2 | 0 | 1/2 |
P6 | 0 | 0 | 0 | 1 | 0 | 0 |
Our final score should now be:
Page | Node1 | Node2 | Node3 | Node4 | Node5 | Node6 |
Final Score | 0.0417 | 0.0417 | 0.0417 | 0.3765 | 0.2111 | 0.2874 |
You’ll notice that our new final ranking is now Node 4 > Node 6 > Node 5 > Node 3 = Node 2 = Node 1. Node 3 moves from fifth place into a three-way tie for fourth, since nodes 1, 2, and 3 now share the same score. We can thus conclude that linking our company’s webpage to a dangling node hurts our ranking while benefiting the dangling node. We can eliminate this problem by removing the link from our page to the dangling node.
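The three-way tie can also be checked by hand. The following is a sketch under the assumptions used so far (α = 0.9, n = 6, and the dangling node's score spread evenly over all pages). The symbols b and x are introduced here for illustration: b is the share every page receives from random jumps and from node 2's redistributed score, and x is the common score of nodes 1, 2, and 3.

```latex
\begin{align*}
b     &= \frac{(1-\alpha) + \alpha\,\pi_2}{n} \\
\pi_1 &= b + \tfrac{\alpha}{2}\,\pi_3 \\
\pi_2 &= b + \tfrac{\alpha}{2}\,\pi_1 \\
\pi_3 &= b + \tfrac{\alpha}{2}\,\pi_1 \\
x     &= \frac{0.1 + 0.9\,x}{6} + 0.45\,x
        \;\Longrightarrow\; 2.4\,x = 0.1
        \;\Longrightarrow\; x = \tfrac{1}{24} \approx 0.0417
\end{align*}
```

The second and third equations give π_2 = π_3 immediately, and subtracting the third from the first gives (1 + α/2)(π_1 - π_3) = 0, so all three scores must be equal. Substituting the common value x yields 1/24, which agrees with the 0.0417 in the table.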
The Third Scenario (Exhibit 3) that we’ll look at is also derived from Scenario 1. Node 2 is no longer a dangling node: it now links to Node 3, which makes it a general webpage with both inlinks and outlinks.
Our H matrix is the following:
From \ To | P1 | P2 | P3 | P4 | P5 | P6 |
P1 | 0 | 1/2 | 1/2 | 0 | 0 | 0 |
P2 | 0 | 0 | 1 | 0 | 0 | 0 |
P3 | 1/3 | 1/3 | 0 | 0 | 1/3 | 0 |
P4 | 0 | 0 | 0 | 0 | 1/2 | 1/2 |
P5 | 0 | 0 | 0 | 1/2 | 0 | 1/2 |
P6 | 0 | 0 | 0 | 1 | 0 | 0 |
Our final score is now:
Page | Node1 | Node2 | Node3 | Node4 | Node5 | Node6 |
Final Score | 0.0598 | 0.0852 | 0.1229 | 0.3062 | 0.1900 | 0.2359 |
You’ll notice that the new ranking in Scenario 3 is Node 4 > Node 6 > Node 5 > Node 3 > Node 2 > Node 1. Node 3 holds the same position as in Scenario 2, but its final score is now greater than those of Node 2 and Node 1. We can conclude that if our company’s webpage links to a general page instead of a dangling node, its rank benefits and its score is no longer tied with the others.
Let’s review the scenarios so far. We’ve discussed one basic scenario and two scenarios that show us how to avoid lowering our webpage ranking. Now we’ll move on to how to increase our page ranking and which nodes we can use to benefit our page.
The Fourth Scenario (Exhibit 4) that we’ll examine gives Node 3 the same number of inlinks as outlinks by adding a link from Node 5 to Node 3. Think of Node 3 as being in perfect balance, like Node 4.
Our H matrix is now the following:
From \ To | P1 | P2 | P3 | P4 | P5 | P6 |
P1 | 0 | 1/2 | 1/2 | 0 | 0 | 0 |
P2 | 0 | 0 | 1 | 0 | 0 | 0 |
P3 | 1/3 | 1/3 | 0 | 0 | 1/3 | 0 |
P4 | 0 | 0 | 0 | 0 | 1/2 | 1/2 |
P5 | 0 | 0 | 1/3 | 1/3 | 0 | 1/3 |
P6 | 0 | 0 | 0 | 1 | 0 | 0 |
Our final score for the Nodes should now be:
Page | Node1 | Node2 | Node3 | Node4 | Node5 | Node6 |
Final Score | 0.0868 | 0.1238 | 0.2183 | 0.2206 | 0.1806 | 0.1699 |
You’ll notice that the new ranking is Node 4 > Node 3 > Node 5 > Node 6 > Node 2 > Node 1. We can conclude that if we can balance the number of inlinks and outlinks, our page ranking will increase.
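To make the idea of balance concrete, the inlinks and outlinks of each page can be counted directly from the link structure. The snippet below is only a sketch: `SCENARIO_4` and `link_counts` are names assumed for this example, and the link lists are read off the Scenario 4 matrix above.

```python
# Scenario 4 outlinks, read off the H matrix above (1-based page numbers).
SCENARIO_4 = {1: [2, 3], 2: [3], 3: [1, 2, 5], 4: [5, 6], 5: [3, 4, 6], 6: [4]}

def link_counts(links):
    """Return {page: (inlinks, outlinks)} for every page in `links`."""
    inlinks = {page: 0 for page in links}
    for targets in links.values():
        for dst in targets:
            inlinks[dst] += 1
    return {page: (inlinks[page], len(links[page])) for page in links}

counts = link_counts(SCENARIO_4)
# A page is "balanced" here when it has as many inlinks as outlinks.
balanced = [page for page, (n_in, n_out) in counts.items() if n_in == n_out]
print(counts)
print("Balanced pages:", balanced)
```

In this structure Node 3 has three inlinks and three outlinks, and Node 4 has two of each.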
So how do you get your webpage to rank the highest on Google? Remember, Google will consider your webpage important if it is referred by other important pages. Generating a link from Node 4 to Node 3, our company’s webpage, should prove this theory correct.
The Fifth Scenario (Exhibit 5) will examine how we can begin to increase our Google page rank.
The new H Matrix is now:
From \ To | P1 | P2 | P3 | P4 | P5 | P6 |
P1 | 0 | 1/2 | 1/2 | 0 | 0 | 0 |
P2 | 0 | 0 | 1 | 0 | 0 | 0 |
P3 | 1/3 | 1/3 | 0 | 0 | 1/3 | 0 |
P4 | 0 | 0 | 1/3 | 0 | 1/3 | 1/3 |
P5 | 0 | 0 | 1/3 | 1/3 | 0 | 1/3 |
P6 | 0 | 0 | 0 | 1 | 0 | 0 |
And our final score is now:
Page | Node1 | Node2 | Node3 | Node4 | Node5 | Node6 |
Final Score | 0.1071 | 0.1553 | 0.3014 | 0.1659 | 0.1569 | 0.1135 |
You’ll notice that the new ranking is Node 3 > Node 4 > Node 5 > Node 2 > Node 6 > Node 1.
This scenario thus supports what we’ve hypothesized all along: recommendations from important pages will help your page ranking.
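The five link structures can be compared side by side with a short script. This is a sketch under the same assumptions as before (α = 0.9, equal starting scores, dangling scores spread evenly). `pagerank`, `position`, and `SCENARIOS` are illustrative names, and each scenario is written as a one-line change to an earlier one, following the H matrices above.

```python
def pagerank(links, alpha=0.9, tol=1e-10, max_iter=1000):
    """Power iteration from equal scores; dangling scores are spread evenly."""
    n = len(links)
    pi = {p: 1.0 / n for p in links}
    for _ in range(max_iter):
        dangling = sum(pi[p] for p in links if not links[p])
        new = {p: (1 - alpha + alpha * dangling) / n for p in links}
        for src, targets in links.items():
            for dst in targets:
                new[dst] += alpha * pi[src] / len(targets)
        change = sum(abs(new[p] - pi[p]) for p in links)
        pi = new
        if change < tol:
            break
    return pi

def position(scores, page):
    """1-based place of `page` when pages are sorted by score, highest first."""
    return sorted(scores, key=scores.get, reverse=True).index(page) + 1

s1 = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}
s2 = {**s1, 3: [1, 5]}     # drop the link 3 -> 2
s3 = {**s1, 2: [3]}        # node 2 now links to node 3
s4 = {**s3, 5: [3, 4, 6]}  # add the link 5 -> 3
s5 = {**s4, 4: [3, 5, 6]}  # add the link 4 -> 3
SCENARIOS = {1: s1, 2: s2, 3: s3, 4: s4, 5: s5}

results = {k: pagerank(links) for k, links in SCENARIOS.items()}
for k, scores in results.items():
    # Exact ties (as in Scenario 2) are ordered arbitrarily by rounding noise.
    print(f"Scenario {k}: Node 3 score {scores[3]:.4f}, place {position(scores, 3)}")
```

Only Node 3's place is printed, since that is the page the scenarios are trying to promote. The exact scores depend on the α chosen.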
Now let’s assume you have a difficult time balancing inlinks and outlinks and a very important website just pointed to your webpage. How will this imbalance affect your ranking?
Conclusion.
So what can we conclude from the scenarios we’ve just examined? They give us a basic understanding of how Google ranks pages, and we can now build our pages strategically to take advantage of what we’ve discovered:
- Attract important webpages to point to our webpage by improving our web design and content
- Balance the number of inlinks we receive with outlinks that point to other important pages
- Try and avoid connecting to dangling nodes, which will negatively impact our ranking
Appendix.