Duplicate content checker

Duplicate content checker / Plagiarism detection.

Updates: 1. The duplicate content checker can now process plain text input in addition to URL input. 2. By clicking the advanced options box, you can choose to search for duplicate content based on multiple data points (text selections). 3. I tweaked the way the returned results are presented.

Use the duplicate content checker to find internal and external duplicate content for a specific webpage. Duplicate content is an important SEO issue, because search engines try to filter out as many duplicates as possible to offer the best search experience. This tool can detect two types of (text based) duplicate content. Duplicate content types:

  1. Internal duplicate content. This means the same text is found on multiple pages on the same domain.
  2. External duplicate content. In this case the same text is found on multiple domains.
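
The distinction between the two types can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: the function names are made up for this example, the labels are borrowed from the result markers the tool uses, and comparing hostnames (ignoring a leading "www.") is a simplifying assumption that treats other subdomains as separate sites. URLs are assumed to include a scheme such as http://.

```python
from urllib.parse import urlparse


def _host(url: str) -> str:
    """Return the lowercase hostname of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_duplicate(input_url: str, found_url: str) -> str:
    """Label a matching URL relative to the URL that was checked."""
    # The checked page itself will usually show up among the matches.
    if found_url.rstrip("/") == input_url.rstrip("/"):
        return "Input URL"
    # Same hostname: the copy lives on your own site.
    if _host(found_url) == _host(input_url):
        return "Internal duplicate"
    # Different hostname: the copy lives on another domain.
    return "External duplicate"
```

For example, classify_duplicate("http://www.example.com/a", "http://example.com/b") is labeled as an internal duplicate, while a match on another hostname is labeled as external.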

Why is it important to prevent duplicate content?

As mentioned above, search engines don’t like duplicate content / plagiarism, because users aren’t interested in a search results page listing multiple URLs that all contain more or less the same content. To prevent this, search engines try to determine the original source, so they can show that URL for a relevant search query and filter out the duplicates. Search engines do a pretty good job at filtering duplicates, but determining the original webpage is still difficult. When the same block of text appears on multiple websites, the algorithm may decide to show the page with the highest authority / trust in the search results, even though it isn’t the original source. If Google detects duplicate content with the intent to manipulate rankings or deceive users, it will make ranking adjustments (Panda filter) or remove the site entirely from the Google index and search results.

How does the duplicate content checker work?

  • Find indexed duplicate content, using URL or TEXT input.
    • Use URL input to extract the main article content / text found in the body of a web page. Navigational elements are removed to reduce noise (otherwise many pages would be falsely identified as internal duplicates).
    • Use text input to get more control over the input.
  • Select advanced options to choose one or multiple data points used to detect duplicate pages. Selecting multiple data points gives you more specific, better matching results. (These data points are automatically extracted from the page content or text input.)
  • Similar content is extracted, returned and marked as: Input URL, Internal duplicate, External duplicate.
  • Export the results to .CSV and use Excel or an OpenOffice spreadsheet to view, edit or report your results.
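
How the data points are selected is not documented here, so the following Python sketch is purely hypothetical. It assumes a data point is a short run of words taken from a sentence, and that the selections are spread evenly over the text; the function name select_data_points and its parameters are invented for this example.

```python
import re


def select_data_points(text: str, count: int = 2, max_words: int = 16) -> list[str]:
    """Pick up to `count` short text selections spread evenly over `text`."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Very short sentences make poor queries, so skip them (assumed threshold).
    candidates = [s for s in sentences if len(s.split()) >= 5]
    if not candidates:
        return []
    count = min(count, len(candidates))
    # Take evenly spaced sentences so the selections cover different sections.
    step = len(candidates) / count
    picked = [candidates[int(i * step)] for i in range(count)]
    # Trim each selection to a query-sized number of words.
    return [" ".join(s.split()[:max_words]) for s in picked]
```

Requiring a page to match two selections from different parts of the text, rather than one, is one way to see why multiple data points return more specific results: a shared cookie notice or boilerplate sentence alone would no longer be enough for a match.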

How to use these results?

Internal duplicates: In most cases you’ll start by solving internal duplicate issues, because these problems exist in your own controlled environment (your website). Different methods can be used to remove internal duplicates, depending on the nature of the problem. Some examples:

  1. Minimize boilerplate repetition
  2. Use a 301 permanent redirect
  3. Use a canonical tag
  4. Use Parameter Handling in Google Webmaster Tools
  5. Prevent a URL from being indexed.
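
To verify that a page actually declares a canonical URL (method 3), a small check like the one below can help. It is a minimal sketch using only Python's standard library; the names CanonicalFinder and find_canonical are made up for this example, and it expects the page's HTML as a string (fetching the page is left out).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter, and only the first canonical one.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

If two internal duplicates both return the same canonical URL, search engines are being told which version you prefer; if the function returns None for a duplicate page, the tag is missing.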

External duplicates: External duplicates can be a different story, because you can’t just make adjustments to your own site to solve the problem. Some examples of how you can remove external duplicates:

  1. Contact webmasters, and ask them to remove the copies of your content.
  2. If another site is duplicating your content in violation of copyright law and contacting them doesn’t solve the problem, you can use this form to notify Google: https://support.google.com/legal/troubleshooter/1114905 .

Tool limitations

  1. This tool automatically extracts the text from a web page to use as input to detect duplicate content. This is not always the exact block of text you would like to check for duplicates. In that case it’s better to use the text input field.
  2. New content needs to be indexed before it can be returned by this tool. If the page / content is less than 2 days old, chances are slim you will get any results.
  3. Not all duplicates found online are returned by this tool, but compared to other tools it returns a fairly large number.

External resources:

  1. Google, https://support.google.com/webmasters/answer/66359?hl=en
  2. Search Engine Land, http://searchengineland.com/library/google/google-panda-update

20 thoughts on “Duplicate content checker”

  1. Jhon e says:

    Nice work! Thanks for providing this great plagiarism tool. Especially because it returns and splits internal and external duplicates.

  2. Bobby Hanson says:

    I have an eCommerce site “chalktalksports.com”. It is on Sales Force Commerce Cloud (formerly Demandware). When we use the duplicate-content-checker, we are matched to many other eCommerce sites on the same platform. How do I find the code that is matching to make this connection?

    Not every eCommerce site on this platform has this problem, but many of us do.

    Thanks,
    Bobby

    1. Jasja ter Horst (admin) says:

      Hi Bobby,

      Just above the “Summary”, after checking your URL, you can see the Query (on the left side). This is the text the tool automatically extracts from your page to perform the duplicate content check.

      In your case this is the following text: “Your browser s Javascript functionality is turned off Please turn it on so that you can experience”. This is exactly why other e-commerce sites running on the same platform show up as duplicates. For the record these are not the type of duplicates which should worry you, since it’s just a very small content section.

      Because of this I would suggest using the text input http://www.seoreviewtools.com/duplicate-content-checker/?text-input to get an accurate duplicate content check.

      Success!

  3. Rasmus Jakobsen says:

    You should have the ability to check regular text, so that I don’t have to create a page and publish it before checking it.

    1. Jasja ter Horst (admin) says:

      Hi Rasmus,

      You can use the text input http://www.seoreviewtools.com/duplicate-content-checker/?text-input to check unpublished text or when the tool isn’t able to do an automatic content extraction.

  4. Daimen says:

    Nice Tool!

    Great copyscape alternative :)

  5. Max K. says:

    “Sorry, problems connecting to the API, please try again tomorrow…”

    3 Days running for me. Anyone else running into this issue?

    1. Jasja ter Horst (admin) says:

      Hey Max,

      Thanks for mentioning it.
      I just solved the problem, so the tool should be working like a charm again ;-)

  6. Jaswinder says:

    Absolutely great tool to check for plagiarism and duplicate content.

    Thanks

  7. jai says:

    Hi, I tried to find the duplicate content for this link: http://www.eurocarparts.com/fr_fr/filtre-habitacle

    It gives 50 results, and among them it counts this internal link: http://www.eurocarparts.com/fr_fr/informations-sur-la-livraison

    The query is: Pour vous offrir la meilleure expérience possible, ce site utilise des cookies. Continuer à utiliser eurocarparts.com signifie que vous acceptez notre utilisation

    I tried to find the query in both URLs but I can’t. Could you please tell me what the issue is?

    1. Jasja ter Horst (admin) says:

      Hi Jai,

      You get this message because the tool detects this text as duplicate for multiple pages (50 in your example).
      The text you refer to is the default cookie text you are showing on all your pages. This isn’t a problem, because it’s just a very small piece of content.
      To check other content sections try the “2 data point” query match or add text manually using the text input option you can find over here: http://www.seoreviewtools.com/duplicate-content-checker/?text-input
      Success!

  8. Jasja ter Horst (admin) says:

    Updates:

    • Expanded the number of API queries.
    • Made some small adjustments to improve the feedback provided by the tool.
  9. sahar says:

    Does your website use an independent testing procedure?
    Or is it supported by copy scape? Because you show other results.

    1. Jasja ter Horst (admin) says:

      Hi Sahar,

      Great question! The duplicate content checker uses its own technology. This explains the differences you’ll encounter when comparing the results from this tool with, for example, Copyscape.

      Cheers!

  10. Dilip Sharma says:

    I am testing My Blog (buzznix.com/) but it is showing “Sorry, problems connecting to the API, please try again tomorrow…”…what to do admin ? #help

    1. Jasja ter Horst (admin) says:

      Hi Dilip,

      API problem solved so you can use this tool again.

      1. Dilip Sharma says:

        Thank You :)

  11. Admin says:

    Update: Just fixed some API issues affecting this tool. Works like a charm again ;-)

  12. yuli agustiani says:

    Hi!! The Duplicate content checker is very helpful to me in improving the content on my blog.

  13. Admin says:

    Updated the “No results” response.
