SEO and JavaScript

Building websites involves many technologies and tools. The rapid development of the industry and growing user expectations constantly push developers to find new ways of presenting content. Unfortunately, the way a site is built and rendered does not always match the expectations of Google, which has long struggled to render pages based on JavaScript. To this day, SEO for JS-heavy websites requires a special approach.

Theory first – what exactly is JavaScript?

A website is code that the browser reads and renders as instructed. The individual elements of the page are added with HTML tags, while its appearance is defined in a separate file written in CSS – most often the styles live in a style.css file.

Websites built with HTML and CSS alone are relatively simple. Complex websites require more technically advanced solutions that allow for dynamic content delivery, server-side HTML rendering, and processing user input and saving it to a database. Additional technologies also make the way information is presented more attractive.

Dynamic applications can be built with a wide range of programming languages that extend the functionality of websites. One of the most popular languages of recent years is JavaScript, which is confirmed, among others, by the 2020 State of the Octoverse report summarizing developer activity on the GitHub platform.

JavaScript was created in late 1995. Initially it was used mainly to validate information entered by the user (e.g. whether the e-mail address in a form is correct); today it allows you to build fully dynamic websites. Thanks to scripts, the website “lives” and reacts to the user’s actions – the image gallery becomes interactive, pop-ups can be closed, and Google Analytics collects information about traffic on the site. JS is also responsible for other elements without which websites no longer seem to meet modern standards:
– infinite scroll (loading subsequent elements, e.g. products, without reloading the page),
– comments and ratings,
– internal linking,
– “best of” lists of products or articles.

Choosing JavaScript as the technology a website is based on also makes it possible to embed elements pulled from external sources, for example the Google Maps API or social media APIs.

JavaScript code lives in a file saved as .js (usually script.js) – a link to it is placed in the head section of the page or immediately before the closing body tag (which is the recommended option). Some snippets are also placed directly in the HTML (e.g. the Google Analytics script), which allows the code to be executed – that is, a specific action to happen on the page – before the entire HTML and CSS structure has loaded. Unfortunately, such inline scripts usually slow down the rendering of the website, so it is worth using them sparingly.

On websites built this way, the HTML structure is loaded first and then styled with CSS. Finally, the JS code is executed in the order it appears in the file – top to bottom.

Client-side rendering or server-side rendering?

JavaScript is a peculiar language – it can be executed on the server (server-side) as well as in the browser (client-side). Both options make it possible to build a modern web application that is attractive to users and to web crawlers.

Server-side rendering (SSR) is the most popular method of displaying web pages. The browser sends a request to the server, which responds with rendered HTML that is then displayed on our screen. The response speed depends, among other things, on:
– the internet connection,
– the server location (the distance from the computer that sent the request),
– the traffic on a given page (how many requests are sent at the same time),
– website optimization (e.g. caching and the possibility of storing some files in the browser cache).

Each subsequent click forces the page to reload – the server sends another HTML response, largely containing the same elements as the previous subpage.

Client-side rendering (CSR) renders the response on the client side – most often in the web browser. In this case, in response to the browser’s request, the server sends a JavaScript file which is responsible for creating the HTML structure and filling it with content. When moving to the next subpage, the website does not have to be reloaded – JS downloads the necessary content from the server and supplements the previously rendered HTML skeleton. The server responds faster and the internet connection is under less load.
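
As a rough sketch, client-side rendering of a single subpage could work along these lines (the /api/articles endpoint and the element IDs are made-up names used only for illustration):

  // A minimal client-side rendering sketch: the server delivers an almost
  // empty HTML skeleton, and JavaScript fetches the content and fills it in.
  async function renderArticle(articleId) {
    const response = await fetch(`/api/articles/${articleId}`); // hypothetical JSON endpoint
    const article = await response.json();

    // Supplement the previously rendered HTML skeleton with content.
    document.querySelector('#title').textContent = article.title;
    document.querySelector('#body').innerHTML = article.bodyHtml;
  }

  renderArticle(123);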

Client-side rendering is usually faster in use, but because all the JS code has to be executed, the first load of the website may take longer than with SSR. It can also cause problems with effective website positioning, since the view is rendered only for a given browser.

Google vs. JS – a bumpy road to success

The possibilities offered by JavaScript have had a significant impact on its popularity. JS allows, among other things, the creation of dynamic pages based on a template written in HTML and CSS and supplemented with data downloaded, for example, from a database. The language also makes it possible to manipulate templates, create additional elements and render the page “on the fly” – while the program is running. This is how the Wix site builder worked in the first years of its existence – it rendered websites based on JS code.

Unfortunately, as I mentioned above, this solution does not make it easy to rank high in Google. For many years, the crawlers from the Mountain View company were unable to analyze JavaScript pages properly, which in turn made it impossible for such sites to compete in the SERPs. In recent years, Google has declared improvements in this area, but the effectiveness with which it reads JS code is still not always satisfactory.

SEO for JavaScript pages requires, first of all, an understanding of how Googlebot processes JS code. For static pages, the robot’s workflow is quite simple.

Google first checks whether it may enter a given address (access for robots can be blocked in the robots.txt file or with a robots meta tag in <head>), then downloads the page data – the HTML structure – and at the same time notes all the links found in the code. Next, the CSS files are read and the page is sent for indexing. This process looks different for JavaScript pages.

The path Googlebot has to travel is then much more complicated. After downloading the HTML file, it fetches the CSS and JS code needed to render the page, supplements it with resources from external sources (if any) and renders the page. The appearance of the page and the necessary elements of its structure are contained in the JS code, which makes it possible to manipulate individual fragments and adapt them to the user’s needs.

Rendering the code before indexing may take a long time, and Google does not guarantee that it will pick up all the information we wanted to include on the page. This is related to the number of URLs the robot scans during the day – the so-called crawl budget. Taking Googlebot’s needs into account allows it to crawl the page effectively, which translates into the site’s visibility in search results.

Sketching Googlebot’s path gives a clear picture of how complex positioning JavaScript-based pages is. Unfortunately, it says nothing about the risks and problems that may appear at the individual steps.

JS pages – what does Google see?

Websites based largely on JavaScript have always been a big challenge for Google’s robots. Their level of indexation is improving, but it is still not as high as we would wish.

Google renders pages differently than the average browser and uses them differently than a human visitor. Its algorithm focuses primarily on the elements necessary to render the website; it may skip those it deems less important and, as a result, ignore them during indexing. This is particularly problematic when those fragments contain the content that was supposed to be the ticket to high positions in search results. Content that depends on cookies is especially at risk of remaining invisible – if it is served on the basis of cookies, Google will probably never reach it. Another issue is the speed at which the code is executed – poor optimization may drag out the whole process and cause the robot to abandon it.

The second problem is Googlebot’s passivity when visiting the website – Google does not scroll the page, does not click the way a user does, and blocks automatic video playback. Unlike other visitors, it may not reach all the prepared content and may not get a complete picture of the site.

Information on how Google renders our site can be obtained with Google’s Mobile-Friendly Test. After entering the address, the test downloads the indicated subpage and returns information about the rendering process, including messages about any problems. A preview of the rendered page also appears next to the results.

You can also check the page in Google Search Console using the URL Inspection tool. Both ways of checking how the site renders provide the data needed to make changes and improve the indexation of a JS-based site.

Single Page Apps – React, Vue.js and other frameworks

The strong emphasis on the speed of data delivery and the popularity of technologies used in mobile applications have resulted in the growing popularity of Single Page Applications (SPAs). This type of website has a single HTML file which is used to render subsequent subpages; the content is downloaded dynamically from the server through JS requests.

SPAs are fast and well received by users. At the start of a visit, the browser downloads all the static elements; the remaining content is fetched as the visit goes on. Transitions between successive parts of the page are dynamic – there is no visible page reload. From the user’s perspective it looks the same as on a more traditional site – links in the menu have a familiar form, for example – but clicking them does not make the browser fetch another HTML file. In fact, it stays on the basic index.html, and the content is downloaded from the database and rendered with JS.

Single Page Apps are built around the AJAX model, which allows asynchronous communication with the server – the document does not have to be refreshed with every user interaction. Pages are built with what Wikipedia calls an “application framework”. The framework is responsible for the application’s structure, its mechanism of operation, components and the necessary libraries. The most popular frameworks include React, used to build application interfaces; Vue.js and Ember.js are also frequently used, and the framework developed and promoted by Google is Angular. The most popular frameworks allow server-side rendering of websites (which is recommended from the perspective of easy crawling by Googlebot) and take into account the requirements of mobile browsers.

As we mentioned earlier, Google cannot always cope with this type of site, which means that without proper optimization, positioning some JavaScript-based Single Page Apps may be impossible. A good example is one such website, which provides a lot of historical information (taken from Wikipedia) and yet Google does not see its potential.

While for a website of this type the lack of presence in Google may not be that significant, for an online store it can significantly affect customer interest. Pages created with popular frameworks – React, Vue.js or Angular – allow you to introduce the elements necessary to appear in the Google index and to serve content in a way that makes it possible to compete for positions.

SEO and optimization of JS pages

The solutions made possible by JavaScript code have a positive impact on the speed of the website and on how users receive it. However, having potential customers see the page is only half the battle – most of them will reach our URL only after finding it in Google’s search results.

Optimizing web pages with a lot of JavaScript code requires taking care of things that seem obvious on static pages.

Access for web crawlers
As mentioned earlier, Googlebot will first check if the URL it encounters is accessible to it. This applies to all resources within it – including JS and CSS files. Google’s rendering of JS pages requires full code access, so avoid blocking these resources in your robots.txt file.
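
For example, a robots.txt along these lines keeps script and style files open to crawlers (the /admin/ rule is purely illustrative):

  User-agent: *
  # Do not disallow scripts and styles – Googlebot needs them to render the page.
  Allow: /*.js
  Allow: /*.css
  Disallow: /admin/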

URLs
One of the problems encountered when optimizing a site based largely on JS is the lack of “traditional” links to its subsequent pages. Googlebot only follows links placed in the href attribute of an <a> tag.
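
A quick sketch of the difference (loadCategory is a hypothetical helper, not part of any real API):

  // Crawlable: a real <a> element with an href that Googlebot can follow.
  const goodLink = document.createElement('a');
  goodLink.href = '/category/shoes';
  goodLink.textContent = 'Shoes';

  // Not crawlable: a click handler without an href – the robot will not follow it.
  const badLink = document.createElement('span');
  badLink.textContent = 'Shoes';
  badLink.addEventListener('click', () => loadCategory('shoes')); // hypothetical helper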

Another URL problem typical of JS-heavy sites is the use of addresses containing “#”. The hash lets the visitor jump straight to a selected fragment of the document without scrolling, but Googlebot generally ignores everything after it, so content available only under such fragment addresses may never be crawled.

One solution to these problems is the HTML5 History API. It allows you to manipulate the browser’s history – e.g. change the address in the address bar and swap the content without reloading the subpage. The API is built into the routing layers of most frameworks (including React Router), which makes it much easier to build websites that can rank in Google. Note – such solutions will not work in very old browsers.
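
A minimal sketch of such navigation, assuming a hypothetical JSON endpoint for each subpage and a #content container:

  // Intercept internal links and swap the content without a full reload.
  document.querySelectorAll('a[data-spa-link]').forEach((link) => {
    link.addEventListener('click', async (event) => {
      event.preventDefault();
      const url = link.getAttribute('href');       // a clean URL, without "#"
      const response = await fetch(`${url}.json`); // hypothetical content endpoint
      const data = await response.json();
      document.querySelector('#content').innerHTML = data.html;
      history.pushState({}, '', url);              // update the address bar, no reload
    });
  });

  // Re-render the content when the user presses back/forward.
  window.addEventListener('popstate', () => {
    // load the content for location.pathname here
  });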

Among other URL problems, it is also worth noting the considerable potential for duplicate pages whose addresses differ only in letter case or a trailing slash. Placing a canonical link in the <head> section fixes this problem – the address it contains is unambiguous information for Google.
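
For illustration, a canonical tag can also be injected from JavaScript when the framework does not handle it for you (a static tag in the initial HTML response is the safer option; the address below is made up):

  const canonical = document.createElement('link');
  canonical.rel = 'canonical';
  canonical.href = 'https://www.example.com/category/shoes'; // illustrative address
  document.head.appendChild(canonical);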

Sitemap.xml
A sitemap – a file listing links to all the addresses within the domain – makes indexing the site much easier. It should include every address we want to invite the bot to visit.
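
A minimal sitemap.xml might look like this (the addresses are placeholders):

  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
      <loc>https://www.example.com/</loc>
    </url>
    <url>
      <loc>https://www.example.com/category/shoes</loc>
    </url>
  </urlset>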

Redirects and the problem of soft 404 errors
An important part of the website optimization process is catching broken URLs and redirecting them to new, correct ones. 301, 302 and similar redirects are performed on the server; in a Single Page App, however, this solution cannot always be applied. As Google’s help documentation suggests, and as Search Engine Land’s tests have shown, proper use of JS code works in a similar way.

A JS redirect will take the visitor from the old address to the new subpage, but it will not return the response code typical of redirects made on the server. From the point of view of presence in search results, however, this is not a problem – the redirect effectively replaces the old address with a new one that can take over its place in the ranking.

window.location.replace() replaces the current document with the one at the given address, and the original address and its contents are not kept in the session history. This kind of operation is another example of effective use of the browser navigation APIs mentioned above.
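
In its simplest form:

  // Unlike setting window.location.href, replace() does not leave the old
  // address in the session history, so the back button will not lead to it.
  window.location.replace('https://www.example.com/new-address'); // illustrative target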

Another server response code important for web crawlers is 404, “Not Found”. On JS pages (although the problem also occurs on badly configured static pages), the absence of a document at a given address may still be answered by the server with a correct 200 status. As a result, non-existent addresses with no content can end up in the search engine’s index. To overcome this problem and tell robots to look at other subpages instead, it is worth adding a code fragment that produces the desired response.
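
A hedged sketch of the two approaches Google’s documentation describes for avoiding such “soft 404s” (the API endpoint and the /not-found route are hypothetical):

  async function showProduct(productId) {
    const response = await fetch(`/api/products/${productId}`); // hypothetical API
    if (response.status === 404) {
      // Option 1: send the client to a URL for which the server returns a real 404.
      window.location.href = '/not-found'; // hypothetical error route
      return;
      // Option 2 (alternative): keep the URL but tell robots not to index the page:
      //   const meta = document.createElement('meta');
      //   meta.name = 'robots';
      //   meta.content = 'noindex';
      //   document.head.appendChild(meta);
    }
    // ...render the product as usual...
  }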

Deferred loading (lazy loading)
Page speed is one of the factors contributing to good website visibility. Delivering content and other page elements at the right pace allows for comfortable use of the website without overloading the user’s internet connection. One of the most effective techniques is lazy loading – delaying the loading of certain website elements. Note – an incorrect implementation may block access to important website resources and, as a result, prevent Googlebot from reaching content that is key for positioning.

When loading, priority should be given to the HTML structure that “builds the page” and to its content. Next in the queue are the graphics, which usually have the greatest impact on the amount of data downloaded. Lazy loading renders the elements visible on the first screen straight away and defers the loading of fragments that only appear after scrolling down.
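
A minimal lazy-loading sketch using the IntersectionObserver API (the data-src convention is just one common approach; modern browsers also support the loading="lazy" attribute on images):

  // Images below the fold keep their real URL in data-src and are only
  // loaded once they come close to the viewport.
  const observer = new IntersectionObserver((entries, obs) => {
    entries.forEach((entry) => {
      if (entry.isIntersecting) {
        const img = entry.target;
        img.src = img.dataset.src; // swap in the real image URL
        obs.unobserve(img);
      }
    });
  }, { rootMargin: '200px' });     // start loading a little before the image is visible

  document.querySelectorAll('img[data-src]').forEach((img) => observer.observe(img));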

Titles, descriptions …
You can’t do effective SEO without optimizing titles and descriptions. For Single Page Apps based on frameworks such as React, Ember or Angular, it is worth adding a module or library that allows these tags to be modified freely. For applications built on React, the most frequently chosen libraries are React Router and React Helmet. Gatsby, a framework based on React, is also becoming more and more popular.
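
A minimal sketch with React Helmet (the component and field names are illustrative):

  import React from 'react';
  import { Helmet } from 'react-helmet';

  function ProductPage({ product }) {
    return (
      <>
        <Helmet>
          <title>{`${product.name} – Example Store`}</title>
          <meta name="description" content={product.shortDescription} />
        </Helmet>
        <h1>{product.name}</h1>
      </>
    );
  }

  export default ProductPage;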

Testing and troubleshooting JavaScript application SEO

For many years, positioning JavaScript-based websites was practically impossible. Methods of delivering content to web crawlers have developed alongside improvements in Google’s ability to render and read JS sites. However, there is still a significant risk of errors when the content of our website is indexed – the solution provided by Google is not perfect.

The way to secure a presence in the SERPs is to give Google’s robots access to the site and to keep an eye on the content they display. For this purpose, it is worth testing the website regularly, especially with the previously mentioned Google tools.

Dynamic rendering – ready for Googlebot

As part of improving the relationship between Googlebot and JavaScript, Google suggests in its documentation various techniques that allow JS code to be processed better. One of them is dynamic rendering.

Dynamic rendering is based on identifying the client (e.g. a browser or a web robot) and serving it a response tailored to its technical capabilities. In practice, when the request comes from a user (a web browser), the page is rendered in the normal way – the HTML file is downloaded and the desired content is fetched from the database by a JS script. When Googlebot asks for a given URL, the server sends a pre-rendered, static HTML version of the page, which enables faster indexing of its content.

Dynamic rendering can be implemented with Rendertron, a tool that works as a standalone HTTP server and renders the content of URLs in a form readable by bots that do not execute JavaScript correctly. Rendertron can store the rendered files in a cache, which significantly speeds up responses to the bot; the cached data is refreshed automatically at intervals we specify.
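
A rough sketch of dynamic rendering with an Express server, assuming Node 18+ (for the global fetch) and a Rendertron instance exposing its /render endpoint – the bot list and addresses are assumptions to adapt to your own setup:

  const express = require('express');

  const app = express();
  const BOTS = /googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit/i;
  const RENDERTRON = 'https://my-rendertron-instance.example.com/render'; // hypothetical

  app.use(async (req, res, next) => {
    const userAgent = req.headers['user-agent'] || '';
    if (!BOTS.test(userAgent)) {
      return next(); // regular visitors get the normal client-side rendered app
    }
    // Bots receive pre-rendered, static HTML for the same URL.
    const url = `${req.protocol}://${req.get('host')}${req.originalUrl}`;
    const rendered = await fetch(`${RENDERTRON}/${encodeURIComponent(url)}`);
    res.status(rendered.status).send(await rendered.text());
  });

  app.use(express.static('dist')); // the regular SPA build – directory name assumed

  app.listen(3000);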

Pre-rendered documents are also useful for other clients – for example, they make it easier to serve content that works well with the screen readers used by visually impaired visitors.

SEO and JavaScript – summary

The growing emphasis on the speed of serving content will certainly lead to a further increase in the popularity of JavaScript. Google takes this into account and is constantly working to improve the indexing of content accessible with the help of JS. Appropriate optimization and the use of bot-friendly solutions are the key to high positions in search results, even for Single Page Apps and other JS-based sites.
