Envision, Create, Share

Welcome to HBGames, a leading amateur game development forum and Discord server. All are welcome, and amongst our ranks you will find experts in their field from all aspects of video game design and development.

Web Crawlers, bots, etc

I have searched around for information on web crawlers and how to make them, but have not found much that fits the purpose I have in mind. I am looking to create a web crawler that goes through a target website's files and then runs some kind of web search to look for plagiarised copies of that work on other websites.

I am aware that this is a broad problem, and that crawlers can be hard or time-consuming to make, but I am hoping someone could point me in the right direction. I would like a crawler, simple or complex, that starts from one URL, looks through other sites for anything that could have been plagiarised, and returns the URL along with part (if not all) of the text that was taken.

As I said, I know this is probably not an easy task, and I have no experience making a web crawler. However, I am majoring in computer science and do have some programming knowledge, so if anyone could point me to a tutorial or some good information on how to start making one, that would be awesome.

Thanks for the help
 
Not sure how much help I'll be, but here's a good reverse image search you could use to compare images and find copies.
They also have an API, which can do a lot of useful things you'll need to make your bot, like "content identification" and "image tracking."
http://www.tineye.com/commercial_api

The only problem is that it requires a yearly subscription called a "search bundle".
Still, the site's worth a look.

Sorry I can't help more,
-action man
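
For the text side of the original question, here is a minimal sketch of the two building blocks such a crawler would probably need: pulling links and visible text out of a page, and scoring how much of one text reappears in another. It uses only the Python standard library. The names `LinkAndTextParser`, `parse_page`, `shingles` and `overlap` are made up for this example, and word-shingle overlap is just one simple way to flag copied text, not something suggested earlier in the thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextParser(HTMLParser):
    """Collects outgoing links and visible text from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def parse_page(html, base_url):
    """Return (links, visible_text) for one page's HTML source."""
    parser = LinkAndTextParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


def shingles(text, size=5):
    """Set of overlapping runs of `size` consecutive lowercase words."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(original, candidate, size=5):
    """Fraction of the original's shingles that also appear in candidate."""
    source = shingles(original, size)
    if not source:
        return 0.0  # original is shorter than one shingle
    return len(source & shingles(candidate, size)) / len(source)
```

A real crawler would wrap this in a loop: fetch a page (for example with `urllib.request.urlopen`), run `parse_page` on it, compare the text against the original with `overlap`, and queue the returned links for later visits. It should also check each site's robots.txt (the standard library has `urllib.robotparser` for that) and pause between requests. Any overlap threshold for calling something "plagiarised" is a judgment call you would have to tune yourself.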
 
