Search analytics and insights training - content auditor

The content auditor report helps you understand and manage your content.

While the tool primarily focuses on your content’s metadata, it can also report on:

  • Readability of your content

  • Usage of undesirable words

  • When content was last updated, modified or published

  • Response times of your content

  • Discovery of duplicate content

The content auditor report can be filtered in a number of ways through the recommendations, overview, attributes and search results screens.

Accessing the content auditor report

You can access the content auditor report by selecting the content auditor tile from the insights dashboard, or by clicking the content auditor link from the results page detail page in the administration dashboard.

Tutorial: Accessing content auditor reports

  1. From the insights dashboard, select the Foodista results page, then click on the content auditor tile or select content auditor from the sidebar menu.

  2. The content auditor - recommendations screen will then load.

    exercise accessing content auditor reports 01

Recommendations

The recommendations screen within the content auditor provides a number of reports about the overall quality of your content.

Reading grade report

The reading grade chart reports on the reading grade level of pages in the index. The reading grade is assessed using the Flesch-Kincaid grade level, which estimates the level of schooling a reader needs to understand the content. For example, a grade of 8 corresponds to an 8th grade reading level: text that is easily understood by a 13-14 year old.

reading grade chart 01

A reading grade level of 8-9 is considered plain English. For WCAG 2.0 accessibility compliance (Level AAA), your site's content should require no more than a lower secondary education reading level (around grade 9). The pages audited for the chart above pass this AAA check.
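The grade calculation can be sketched in a few lines of Python. This is a minimal illustration only: it uses the published Flesch-Kincaid grade level formula, but the syllable counter is a rough vowel-group heuristic and the function names are hypothetical. The content auditor's own implementation may tokenise text differently and produce slightly different grades.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent (e.g. "make"), except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def within_reading_grade(text: str, max_grade: float = 9.0) -> bool:
    """True if the estimated grade is at or below max_grade (9 assumed here)."""
    return flesch_kincaid_grade(text) <= max_grade
```

Short sentences made of short words score low, while long sentences full of multi-syllable words push the grade up quickly.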

Clicking on any of the bars in this report filters the recommendations data to that reading grade level. For example, clicking on the grade 5 bar displays an analysis of the pages at that grade.

reading grade chart 02

Missing metadata report

The missing metadata table identifies pages that are missing metadata fields that are configured for reporting within the content auditor.

missing metadata 01

By default the content auditor reports on author, format, language, publisher and subject metadata.

Clicking one of the items filters the report to show only items missing the selected field.

For the search in the example above, the default metadata fields are missing from all pages. An additional metadata field, tags, is also reported on, and this is missing from 3581 pages on the site.
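The counting behind a missing metadata table can be illustrated as follows. This is a sketch under stated assumptions: pages are represented as plain dictionaries, the function name is hypothetical, and a field counts as missing when it is absent or blank. Only the list of default fields comes from the text above.

```python
DEFAULT_FIELDS = ("author", "format", "language", "publisher", "subject")


def count_missing(pages, fields=DEFAULT_FIELDS):
    """Return {field: number of pages where the field is absent or blank}."""
    missing = {field: 0 for field in fields}
    for page in pages:
        for field in fields:
            value = page.get(field)
            if value is None or not str(value).strip():
                missing[field] += 1
    return missing


# Example: also report on a custom "tags" field, as in the Foodista search.
pages = [
    {"format": "html", "tags": "soup"},
    {"format": "html"},
]
report = count_missing(pages, fields=DEFAULT_FIELDS + ("tags",))
```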

Duplicate titles report

The duplicate titles table lists titles that are found in more than one page. A duplicate title could indicate a duplicate content page, or a poorly titled page.

Duplicate titles should be avoided because they can confuse users.

duplicate titles 01

This table can be used to identify the pages with duplicate titles so that a web administrator can take the appropriate action (which could be removing duplicate content or re-titling a page to provide better context).

Clicking one of the items filters the report to show only items that contain the selected title. Clicking the view all button opens the attributes report.
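Finding titles shared by more than one page amounts to grouping pages by title. The sketch below assumes pages are supplied as (URL, title) pairs and normalises whitespace and case before comparing. That normalisation is an assumption made for illustration; the content auditor may compare titles exactly.

```python
from collections import defaultdict


def find_duplicate_titles(pages):
    """pages: iterable of (url, title). Returns {title: [urls]} for titles on 2+ pages."""
    by_title = defaultdict(list)
    for url, title in pages:
        # Assumed normalisation: collapse whitespace and ignore case.
        key = " ".join(title.split()).casefold()
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```

Each entry in the result lists the URLs a web administrator would need to review for removal or re-titling.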

Date modified report

The date modified report presents a chart of when pages / documents were last modified. This is based upon the metadata of the page/document and can be helpful to identify which documents should be updated and/or reviewed.

date modified 01

Moving your mouse over a bar in the chart shows how many pages / documents were last modified in that timeframe. Clicking on a bar filters the report to the selected year.

Response time report

The response time report provides you with a bar chart of the time taken to download documents. This may help identify pages / documents / sections or entire sites where response time is in need of improvement.

Note: the response time only measures the time taken to retrieve the document itself and does not include linked resources. It is not the same as the page load time, which includes the time taken to load these resources (such as images presented in an HTML page).
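The distinction can be made concrete with a small timing sketch. It times a single document request and nothing else: images, stylesheets and scripts referenced by the page are never fetched. The function names and the 10 second timeout are assumptions for illustration, and this is not how the content auditor itself records response times.

```python
import time
import urllib.request


def fetch_with_urllib(url: str) -> bytes:
    """Download one document; linked resources are not requested."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def time_document_fetch(url, fetch=fetch_with_urllib):
    """Return (seconds taken, size in bytes) for retrieving a single document."""
    start = time.perf_counter()
    body = fetch(url)
    elapsed = time.perf_counter() - start
    return elapsed, len(body)
```

A browser's page load time for the same URL would normally be longer, because it also waits for the linked resources.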

Hovering over the bars in the chart provides additional information.

response time 01

Undesirable text report

The undesirable text table reports on undesirable words found within the content.

By default, the undesirable word list includes common misspellings, but it can be customised to identify organisation-specific undesirable words (such as specific words banned in editorial policies or other terms such as acronyms).

undesirable text 01

Clicking one of the items filters the report to show only items that contain the specific word. Clicking the view all button opens the attributes report.
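Checking content against an undesirable word list can be sketched as a whole-word, case-insensitive match. The function name and matching rules below are assumptions for illustration; the content auditor's own matching may differ.

```python
import re
from collections import Counter


def find_undesirable(text, undesirable):
    """Count whole-word, case-insensitive occurrences of each undesirable word."""
    wanted = {word.lower() for word in undesirable}
    words = re.findall(r"[A-Za-z']+", text.lower())
    return Counter(word for word in words if word in wanted)
```

Because matching is on whole words, the misspelling tomatos is reported but the correct spelling tomatoes is not.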

Duplicate content

The duplicate content report shows documents for which the content (or if configured, some metadata) is duplicated by other documents. Duplicated content makes a site more difficult to navigate, and may also be penalised as a ranking factor by some search engines.
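One simple way to detect duplicated content is to hash each document's text and group documents that share a hash. The sketch below only finds exact matches after whitespace and case normalisation, and the function name is hypothetical. The text above does not describe the method the content auditor uses, so treat this as an illustration of the idea only.

```python
import hashlib
from collections import defaultdict


def group_duplicates(documents):
    """documents: {url: text}. Returns lists of URLs whose normalised text is identical."""
    groups = defaultdict(list)
    for url, text in documents.items():
        # Assumed normalisation: collapse whitespace and ignore case.
        normalised = " ".join(text.split()).casefold()
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```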

Tutorial: Content auditor recommendations

  1. Log in to the dashboard and select the Foodista results page. Access the content auditor report by clicking on the content auditor sidebar item, or by clicking on the content auditor tile within the main dashboard display.

  2. Observe the different tables and charts that form the recommendations page - reading grade, missing metadata, date modified, duplicated titles and response times.

  3. Click on the bar with a value of 6 from the reading grade chart. The display changes to indicate that filters are applied for a reading grade level of 6. The reading grade chart disappears and the other tables and charts update with the filter applied. The filter will remain applied while navigating around the content auditor report until it is removed by clicking on the cross that appears next to the reading grade filter, or by clicking on the clear all filters button.

    exercise content auditor recommendations 01
  4. Click on tomatos from the undesirable text table. The display updates to indicate that the undesirable text filter is applied for the word tomatos. Observe that the reading grade filter is still applied. The report now displays information for pages with a reading grade of 6 that include the undesirable text tomatos.

    exercise content auditor recommendations 02
  5. Clear the filters by clicking on the clear all filters button. The content auditor report reverts to the original state.
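The way filters accumulate in the steps above can be modelled as a logical AND over every active filter. The sketch below uses made-up page records and hypothetical field names purely to mirror steps 3 to 5; it is not the content auditor's data model.

```python
def apply_filters(pages, filters):
    """Keep pages that satisfy every active filter (logical AND)."""
    return [page for page in pages if all(check(page) for check in filters.values())]


pages = [
    {"url": "/a", "reading_grade": 6, "text": "ripe tomatos"},
    {"url": "/b", "reading_grade": 6, "text": "ripe tomatoes"},
    {"url": "/c", "reading_grade": 8, "text": "green tomatos"},
]

filters = {}

# Step 3: filter on reading grade 6.
filters["reading_grade"] = lambda page: page["reading_grade"] == 6
step3 = apply_filters(pages, filters)

# Step 4: add the undesirable text filter; the reading grade filter stays applied.
filters["undesirable_text"] = lambda page: "tomatos" in page["text"].split()
step4 = apply_filters(pages, filters)

# Step 5: clear all filters; the report reverts to every page.
filters.clear()
step5 = apply_filters(pages, filters)
```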

Content auditor overview

The overview tab reports on the top values found for each of the metadata fields covered by the content auditor report.

Metadata fields that are configured but missing in all the pages are suppressed.

Each metadata field can be explored further by clicking the corresponding view all button, or the report can be filtered to a specific metadata value by clicking on one of the values.

content auditor overview 01

Content auditor attributes

The attributes tab provides a complete list of metadata values for each metadata field that is included in the content auditor report. Clicking on one of the values in the list will restrict subsequent reports to documents containing that metadata value.

content auditor attributes 01

Content auditor search results

The search results tab lists the individual pages that match the current search criteria. These criteria consist of any search terms entered into the search box, filtered by any of the metadata values selected on other screens.

The results are returned as a table that shows the metadata values for each item along with some tools linking in with other parts of the insights dashboard.

The table lists:

  • Title and URL of the document / page

  • File size

  • Last updated date

  • Format

  • Metadata that is configured to be reported on

  • Quick access to additional tools

The listing can be exported as CSV.

Additional tools are available for each item in the listing, providing quick access to each tool in the context of the selected page. Each tool is shown as an icon in the listing:

  • Analyse anchor tags: provides information about pages that link to the current page.

  • SEO auditor: loads the page into SEO auditor, allowing for analysis of how the page performs for specific search terms.

  • Check accessibility with WCAG auditor: provides information on how the page conforms to WCAG accessibility checks.

  • Preview the page / document: shows a thumbnail sized preview of the page.

  • View cached copy: loads the cached (or locally saved) copy of the page.

Tutorial: Metadata reporting

  1. Log in to the insights dashboard and select the Foodista results page. Access the content auditor report by clicking on the content auditor sidebar item, or by clicking on the content auditor tile within the main dashboard display.

  2. Click on the tab labelled overview. The metadata overview appears showing the top four values for date modified, duplicated titles, generator and tags. Tags is a custom metadata field that has been configured for the Foodista search. Note: the other default metadata fields (author, subject, language etc.) are not shown because the pages on the Foodista site do not include any of this metadata.

    exercise metadata reporting 01
  3. Click on the view all link that appears below the tags table.

  4. Subdivide this report further to show only content modified in 2016 by selecting date modified from the sidebar, then 2016 from the attributes list.

  5. No further detail is available to drill down on in the date modified attribute. The individual pages from 2016 that have a tag can be viewed from the search results tab:

    exercise metadata reporting 02
  6. Within the search results tab, we can examine each matching result in more detail. Hovering over each result will trigger the appearance of additional icons:

    exercise metadata reporting 03
  7. For a given search result, trigger the anchors summary sub-report, showing which pages within a collection link to this one, and what link text is used.

    exercise metadata reporting 04
    exercise metadata reporting 05

    The anchors summary lists information about the links that reference the page being examined, including the words used (which are counted towards the content for the document) and the type of link.

  8. For a given search result, trigger the accessibility auditor sub-report, producing advice on how to improve the page’s compliance with WCAG2:

    exercise metadata reporting 06
    exercise metadata reporting 07
  9. For a given search result, trigger the preview functionality, showing a thumbnail of how the page appears for a desktop-based browser by hovering over the eye icon:

    exercise metadata reporting 08
  10. For a given search result, inspect the cached copy of the document by clicking on the history icon:

    exercise metadata reporting 09
    exercise metadata reporting 10

Tutorial: Export content auditor reports

  1. Export the report as a CSV file by clicking on the export CSV data button:

    exercise export content auditor reports 01
  2. A download should start automatically with the file saved to your default download location as content-auditor-export.csv.
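Once downloaded, the export can be processed with any CSV-aware tool. The sketch below uses Python's standard csv module to count rows by the value in one column. The column name used in the example ("Format") is an assumption for illustration only; check the header row of your own export for the actual column names.

```python
import csv
from collections import Counter


def count_by_column(csv_file, column):
    """Count rows in an open CSV file by the value found in the given column."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in export")
    return Counter(row[column] or "(missing)" for row in reader)


# Example usage, assuming the export has a column named "Format":
# with open("content-auditor-export.csv", newline="", encoding="utf-8") as f:
#     print(count_by_column(f, "Format"))
```

Opening the file with newline="" follows the csv module's recommendation and avoids problems with quoted fields that contain line breaks.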