A First Look: Analyzing How Google Ranks Webpages

Research and Analysis by Changyue Jack An

Google's webpage rankings are largely determined by the number of links pointing out of a webpage (outlinks) and the number of links pointing into it (inlinks). We can think of a webpage as playing one of two roles: a recommender or a recommended page. The more important your webpage is considered to be, the more highly valued your outlinks will be. Similarly, your webpage can gain status and climb the Google rankings if a very important webpage recommends your site via its outlinks. A letter of recommendation from someone like Warren Buffett, for example, carries far more weight than one from a stranger.

Before we dive into Google rankings and how to accelerate your growth, there are several terms that need to be covered. The first is a “hub”, a webpage that contains only outlinks. We can imagine a hub as a very generous website with little power of its own. The second is an “authority”, also known as a “dangling node” or “end point”. These are pages that have only inlinks and no outlinks. The final term is the “general webpage”, which contains both inlinks and outlinks. We’ll use these terms, along with the exhibits in our appendix, to explain several scenarios that occur when webpages use links as part of their growth strategy to rank higher on Google.

The First Scenario we’ll examine is based on the diagrams in our appendix. In Exhibit 1, let’s assume that node 3 is our company’s webpage. Based on the diagram, it appears to be an important page because it has the most inlinks and outlinks. Our page also connects the upper triangle relationship with the lower triangle relationship, for a total of six nodes (webpages). With six nodes, Google starts by scoring each node equally at 1/6.

[Figure: The First Scenario Matrix]

From this link structure, Google builds a normalized hyperlink matrix, which we’ll denote H:

H_ij = 1/|P_i| if there is a link from node i to node j, and H_ij = 0 otherwise.

Here |P_i| is the number of outlinks from node i. For example, node 1 has two outlinks, to nodes 2 and 3, so:

H_12 = H_13 = 1/2

You’ll notice, though, that node 1 doesn’t link to node 6, so H_16 = 0. Our final H matrix looks like this:

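|    | P1  | P2  | P3  | P4  | P5  | P6  |
|----|-----|-----|-----|-----|-----|-----|
| P1 | 0   | 1/2 | 1/2 | 0   | 0   | 0   |
| P2 | 0   | 0   | 0   | 0   | 0   | 0   |
| P3 | 1/3 | 1/3 | 0   | 0   | 1/3 | 0   |
| P4 | 0   | 0   | 0   | 0   | 1/2 | 1/2 |
| P5 | 0   | 0   | 0   | 1/2 | 0   | 1/2 |
| P6 | 0   | 0   | 0   | 1   | 0   | 0   |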
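As a minimal sketch of how this matrix can be assembled, the snippet below builds H from a list of outlinks per node. The `outlinks` dictionary and the `build_h` helper are illustrative names that are not part of the original analysis; the link structure itself is read off the first-scenario H matrix.

```python
import numpy as np

# Outlinks per node for the first scenario (1-based node labels),
# read off the H matrix; node 2 is the dangling node.
outlinks = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}

def build_h(outlinks, n):
    """Return the n x n normalized hyperlink matrix (hypothetical helper)."""
    H = np.zeros((n, n))
    for i, targets in outlinks.items():
        for j in targets:
            # Each outlink of node i receives an equal share 1/|P_i|.
            H[i - 1, j - 1] = 1.0 / len(targets)
    return H

H = build_h(outlinks, 6)
print(H.round(3))
```

Every row with at least one outlink sums to 1, while the dangling node's row stays all zeros.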

 

Next, we’ll find the Google matrix:

G = αH + (αa + (1 − α)e)(1/n)e^T

α is the fraction of the time an individual follows the hyperlink structure rather than jumping to a new website. Based on the Google page ranking book, we’ll use α = 0.9. H is still the normalized hyperlink matrix, and a is the binary dangling node vector, with a_i = 1 if node i is a dangling node and 0 otherwise. Here, we have one dangling node, node 2.

e^T is the row vector of all 1’s and e is the column vector of all 1’s. n is the number of pages, so with n = 6 we also get the following:

(1/n)e^T = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6)
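The Google matrix can be sketched in a few lines of NumPy. The function name `google_matrix` is an assumption made for this example, and the dangling node vector a is derived here by checking which rows of H are all zeros.

```python
import numpy as np

def google_matrix(H, alpha=0.9):
    """Sketch of G = alpha*H + (alpha*a + (1 - alpha)*e) * (1/n) * e^T."""
    n = H.shape[0]
    a = (H.sum(axis=1) == 0).astype(float)  # 1 for dangling nodes, else 0
    e = np.ones(n)
    # Outer product of a column vector with the row vector (1/n) e^T.
    return alpha * H + np.outer(alpha * a + (1 - alpha) * e, e / n)

# First-scenario H matrix; row 2 (index 1) is the dangling node.
H = np.array([
    [0,   1/2, 1/2, 0,   0,   0],
    [0,   0,   0,   0,   0,   0],
    [1/3, 1/3, 0,   0,   1/3, 0],
    [0,   0,   0,   0,   1/2, 1/2],
    [0,   0,   0,   1/2, 0,   1/2],
    [0,   0,   0,   1,   0,   0],
])
G = google_matrix(H)
print(G.round(4))
```

Each row of G sums to 1, and the dangling node's row becomes uniform at 1/6, which is what lets the iteration keep moving when it reaches a page with no outlinks.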

Now, starting from π(0) = (1/n)e^T, we use

π(k+1) = π(k)G

to calculate the rank of each page. We get

π(1) = π(0)G, π(2) = π(1)G,

and so forth, until π(k+1) and π(k) agree to the desired precision, that is, until the iteration has converged.
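The iteration can be sketched as a simple loop that stops once successive score vectors stop changing. The `pagerank` function, the tolerance, and the iteration cap are assumptions of this sketch. The damping value is passed in as a parameter; the 0.85 in the usage line is likewise an assumption, chosen because it lands close to the scores tabulated for this scenario, while 0.9 appears to give the same ordering with somewhat different scores.

```python
import numpy as np

def pagerank(H, alpha, tol=1e-10, max_iter=1000):
    """Power iteration pi(k+1) = pi(k) G, starting from the uniform vector."""
    n = H.shape[0]
    a = (H.sum(axis=1) == 0).astype(float)  # dangling node indicator
    G = alpha * H + np.outer(alpha * a + (1 - alpha), np.ones(n) / n)
    pi = np.ones(n) / n                      # pi(0) = (1/n) e^T
    for _ in range(max_iter):
        new_pi = pi @ G
        if np.abs(new_pi - pi).sum() < tol:  # successive vectors agree
            return new_pi
        pi = new_pi
    return pi

# First-scenario H matrix.
H = np.array([
    [0,   1/2, 1/2, 0,   0,   0],
    [0,   0,   0,   0,   0,   0],
    [1/3, 1/3, 0,   0,   1/3, 0],
    [0,   0,   0,   0,   1/2, 1/2],
    [0,   0,   0,   1/2, 0,   1/2],
    [0,   0,   0,   1,   0,   0],
])
pi = pagerank(H, alpha=0.85)
print(pi.round(4))
```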

 

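|             | Node 1 | Node 2 | Node 3 | Node 4 | Node 5 | Node 6 |
|-------------|--------|--------|--------|--------|--------|--------|
| Final Score | 0.0517 | 0.0737 | 0.0574 | 0.3487 | 0.1999 | 0.2686 |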

 

Based on our calculations, the final ranking of the six pages is Node 4 > Node 6 > Node 5 > Node 2 > Node 3 > Node 1. This result disproves our initial assumption that node 3 is the most important node.
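Turning final scores into a ranking is just a sort in descending order. A small sketch using the first-scenario scores reported above (the `scores` dictionary is simply a convenient container for this example):

```python
# Final scores for the first scenario, as reported in the table.
scores = {
    "Node 1": 0.0517, "Node 2": 0.0737, "Node 3": 0.0574,
    "Node 4": 0.3487, "Node 5": 0.1999, "Node 6": 0.2686,
}

# Sort node names by score, highest first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(" > ".join(ranking))
# prints: Node 4 > Node 6 > Node 5 > Node 2 > Node 3 > Node 1
```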

The Second Scenario (Exhibit 2) that we’ll examine is derived from scenario 1, which we just outlined above. This time, we’ll eliminate the outlink from node 3 to node 2 to see if we can alter our final ranking results.

[Figure: Final Ranking Results]

Our H matrix now looks like this:

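|    | P1  | P2  | P3  | P4  | P5  | P6  |
|----|-----|-----|-----|-----|-----|-----|
| P1 | 0   | 1/2 | 1/2 | 0   | 0   | 0   |
| P2 | 0   | 0   | 0   | 0   | 0   | 0   |
| P3 | 1/2 | 0   | 0   | 0   | 1/2 | 0   |
| P4 | 0   | 0   | 0   | 0   | 1/2 | 1/2 |
| P5 | 0   | 0   | 0   | 1/2 | 0   | 1/2 |
| P6 | 0   | 0   | 0   | 1   | 0   | 0   |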

Building the Google matrix from this H and iterating as in the first scenario, our final score should now be:

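|             | Node 1 | Node 2 | Node 3 | Node 4 | Node 5 | Node 6 |
|-------------|--------|--------|--------|--------|--------|--------|
| Final Score | 0.0417 | 0.0417 | 0.0417 | 0.3765 | 0.2111 | 0.2874 |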

 

You’ll notice that our new final ranking is Node 4 > Node 6 > Node 5 > Node 3 = Node 2 = Node 1. Node 3 moves from fifth place into a three-way tie for fourth, since nodes 1, 2, and 3 now share the same score. Comparing the two scenarios, linking from our company’s webpage to a dangling node hurts our ranking while benefiting the dangling node. We can eliminate this problem by removing the link from our page to the dangling node.
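To compare the two scenarios like for like, one option is to run both link structures through the same routine with the same damping value. The sketch below does that; the `pagerank` helper, the fixed iteration count, and the `scenario_1` / `scenario_2` dictionaries are illustrative assumptions, with the links read off the two H matrices.

```python
import numpy as np

def pagerank(outlinks, alpha=0.9, iters=500):
    """Build H from an outlink dict, form G, and run a fixed number of steps."""
    n = len(outlinks)
    H = np.zeros((n, n))
    for i, targets in outlinks.items():
        for j in targets:
            H[i - 1, j - 1] = 1 / len(targets)
    a = (H.sum(axis=1) == 0).astype(float)  # dangling node indicator
    G = alpha * H + np.outer(alpha * a + (1 - alpha), np.ones(n) / n)
    pi = np.ones(n) / n
    for _ in range(iters):
        pi = pi @ G
    return pi

scenario_1 = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}
scenario_2 = dict(scenario_1)
scenario_2[3] = [1, 5]  # drop the outlink from node 3 to dangling node 2

p1 = pagerank(scenario_1)
p2 = pagerank(scenario_2)
print("scenario 1:", p1.round(4))
print("scenario 2:", p2.round(4))
```

With the damping value held fixed, node 2's score falls once the link is removed, while nodes 1, 2, and 3 end up tied, in line with the tie in the table above.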

The Third Scenario (Exhibit 3) that we’ll look at is also derived from scenario 1. Node 2 is no longer a dangling node; it now links to node 3, which makes it a general webpage with both inlinks and outlinks.

Our H matrix is the following:

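|    | P1  | P2  | P3  | P4  | P5  | P6  |
|----|-----|-----|-----|-----|-----|-----|
| P1 | 0   | 1/2 | 1/2 | 0   | 0   | 0   |
| P2 | 0   | 0   | 1   | 0   | 0   | 0   |
| P3 | 1/3 | 1/3 | 0   | 0   | 1/3 | 0   |
| P4 | 0   | 0   | 0   | 0   | 1/2 | 1/2 |
| P5 | 0   | 0   | 0   | 1/2 | 0   | 1/2 |
| P6 | 0   | 0   | 0   | 1   | 0   | 0   |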

Applying the same Google matrix formula and iteration as in the first scenario, our final score is now:

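|             | Node 1 | Node 2 | Node 3 | Node 4 | Node 5 | Node 6 |
|-------------|--------|--------|--------|--------|--------|--------|
| Final Score | 0.0598 | 0.0852 | 0.1229 | 0.3062 | 0.1900 | 0.2359 |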

 

You’ll notice that the new ranking in Scenario 3 is Node 4 > Node 6 > Node 5 > Node 3 > Node 2 > Node 1. Node 3 holds the same fourth-place rank as in Scenario 2, but its final score is now greater than those of Node 2 and Node 1 rather than tied with them. We can conclude that if our company’s webpage links to a general page instead of a dangling node, our rank benefits and our score pulls ahead of the other two pages.

Let’s review the scenarios we have so far. We’ve discussed one basic scenario and two scenarios that show us how to avoid lowering our webpage ranking. Now we will move forward and discuss how to increase our page ranking and which nodes we can use to benefit our page.

The Fourth Scenario (Exhibit 4) that we’ll examine assumes Node 3 has the same number of outlinks and inlinks. To get there, we add a link from Node 5 to Node 3. Think of Node 3 as being in perfect balance, like Node 4.

[Figure: Fourth Scenario Matrix]

Our H matrix is now the following:

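|    | P1  | P2  | P3  | P4  | P5  | P6  |
|----|-----|-----|-----|-----|-----|-----|
| P1 | 0   | 1/2 | 1/2 | 0   | 0   | 0   |
| P2 | 0   | 0   | 1   | 0   | 0   | 0   |
| P3 | 1/3 | 1/3 | 0   | 0   | 1/3 | 0   |
| P4 | 0   | 0   | 0   | 0   | 1/2 | 1/2 |
| P5 | 0   | 0   | 1/3 | 1/3 | 0   | 1/3 |
| P6 | 0   | 0   | 0   | 1   | 0   | 0   |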
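The balance described for this scenario can be checked directly from the H matrix: the nonzero entries in a row are that node's outlinks, and the nonzero entries in a column are its inlinks. A short sketch, with variable names chosen for this example:

```python
import numpy as np

# Fourth-scenario H matrix.
H = np.array([
    [0,   1/2, 1/2, 0,   0,   0],
    [0,   0,   1,   0,   0,   0],
    [1/3, 1/3, 0,   0,   1/3, 0],
    [0,   0,   0,   0,   1/2, 1/2],
    [0,   0,   1/3, 1/3, 0,   1/3],
    [0,   0,   0,   1,   0,   0],
])

links = H > 0                    # True wherever a link exists
out_counts = links.sum(axis=1)   # links leaving each node (row-wise)
in_counts = links.sum(axis=0)    # links arriving at each node (column-wise)

for node in range(6):
    print(f"Node {node + 1}: {in_counts[node]} inlinks, {out_counts[node]} outlinks")
```

Node 3 comes out with three inlinks and three outlinks, and Node 4 with two of each.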

Applying the same Google matrix formula and iteration as before, our final score for the nodes should now be:

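|             | Node 1 | Node 2 | Node 3 | Node 4 | Node 5 | Node 6 |
|-------------|--------|--------|--------|--------|--------|--------|
| Final Score | 0.0868 | 0.1238 | 0.2183 | 0.2206 | 0.1806 | 0.1699 |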

 

You’ll notice that the new ranking is Node 4 > Node 3 > Node 5 > Node 6 > Node 2 > Node 1. We can conclude that if we can balance the number of inlinks and outlinks, our page ranking will increase.

So how do you get your webpage to rank the highest on Google? Remember, Google will consider your webpage important if it is referred by other important pages. Generating a link from Node 4 to Node 3, our company’s webpage, should prove this theory correct.

The Fifth Scenario (Exhibit 5) will examine how we can begin to increase our Google page rank.

[Figure: Fifth Scenario Matrix]

The new H Matrix is now:

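|    | P1  | P2  | P3  | P4  | P5  | P6  |
|----|-----|-----|-----|-----|-----|-----|
| P1 | 0   | 1/2 | 1/2 | 0   | 0   | 0   |
| P2 | 0   | 0   | 1   | 0   | 0   | 0   |
| P3 | 1/3 | 1/3 | 0   | 0   | 1/3 | 0   |
| P4 | 0   | 0   | 1/3 | 0   | 1/3 | 1/3 |
| P5 | 0   | 0   | 1/3 | 1/3 | 0   | 1/3 |
| P6 | 0   | 0   | 0   | 1   | 0   | 0   |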

Applying the same Google matrix formula and iteration as before, our final score is now:

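|             | Node 1 | Node 2 | Node 3 | Node 4 | Node 5 | Node 6 |
|-------------|--------|--------|--------|--------|--------|--------|
| Final Score | 0.1071 | 0.1553 | 0.3014 | 0.1659 | 0.1569 | 0.1135 |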

 

You’ll notice that the new ranking is Node 3 > Node 4 > Node 5 > Node 2 > Node 6 > Node 1.

This scenario thus supports what we’ve hypothesized all along: recommendations from important pages will help your page ranking.
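The effect of the new recommendation can be sketched by scoring the fourth and fifth scenarios side by side, changing only Node 4's row of H. The `pagerank` helper, the fixed iteration count, and the shared damping value of 0.9 are assumptions of this sketch.

```python
import numpy as np

def pagerank(H, alpha=0.9, iters=500):
    """Form the Google matrix from H and apply pi <- pi G repeatedly."""
    n = len(H)
    dangling = (H.sum(axis=1) == 0).astype(float)
    G = alpha * H + np.outer(alpha * dangling + (1 - alpha), np.full(n, 1 / n))
    pi = np.full(n, 1 / n)
    for _ in range(iters):
        pi = pi @ G
    return pi

# Fourth-scenario H matrix.
H4 = np.array([
    [0,   1/2, 1/2, 0,   0,   0],
    [0,   0,   1,   0,   0,   0],
    [1/3, 1/3, 0,   0,   1/3, 0],
    [0,   0,   0,   0,   1/2, 1/2],
    [0,   0,   1/3, 1/3, 0,   1/3],
    [0,   0,   0,   1,   0,   0],
])

# Fifth scenario: Node 4 now also links to Node 3.
H5 = H4.copy()
H5[3] = [0, 0, 1/3, 0, 1/3, 1/3]

before = pagerank(H4)
after = pagerank(H5)
print("Node 3 before:", before[2].round(4), "after:", after[2].round(4))
print("Top node after the new link: Node", int(np.argmax(after)) + 1)
```

Holding everything else fixed, the single extra inlink from Node 4 is enough to move Node 3 to the top of the ranking.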

Now let’s assume you have a difficult time balancing inlinks and outlinks and a very important website just pointed to your webpage. How will this imbalance affect your ranking?

Conclusion.

So what can we conclude from the seven scenarios we’ve just examined? These scenarios give us a basic understanding of how Google ranks pages and now we can strategically build our pages to take advantage of what we’ve discovered:

  1. Attract important webpages to point to our webpage by improving our web design and content
  2. Balance the number of inlinks we receive with outlinks that point to other important pages
  3. Try and avoid connecting to dangling nodes, which will negatively impact our ranking

Appendix.

Exhibit 1

Exhibit 2

Exhibit 3

Exhibit 4

Exhibit 5