

FOCUS: A SUPERVISED FORUM CRAWLER

June 17, 2014 by IeeeAdmin

Internet forums are important platforms where users can request and exchange information with others. For example, the TripAdvisor Travel Board is a place where people can ask for and share travel tips. Because forums are so rich in information, researchers are increasingly interested in mining knowledge from them. Prior work has tried to mine business intelligence from forum data, proposed algorithms to extract expertise networks in forums, and identified question-and-answer pairs in forum threads. According to an eMarketer article, "Where Are Social Media Marketers Seeing the Most Success?", forums are still part of the global social media strategy of the Top 500 Companies, and these companies still report strong marketing success with them.

To harvest knowledge from forums, their contents have to be downloaded first. Generic crawlers, which adopt a breadth-first traversal strategy, are usually ineffective and inefficient for forum crawling. This is mainly due to two characteristics of forums that are unfriendly to crawlers: (1) duplicate links and uninformative pages, and (2) page-flipping links.

A forum usually has many duplicate links, which point to a common page but with different URLs, e.g., shortcut links pointing to the latest posts, or URLs for user-experience functions such as "view by title". A generic crawler that blindly follows these links will crawl many duplicate pages, which makes it inefficient. A forum also typically has many uninformative pages, such as login controls that protect users' privacy. Following the links to them, a crawler will fetch many uninformative pages. Forum operators can instruct web crawlers on how to crawl a site effectively through standards-based methods such as the "rel" attribute with the "nofollow" value, the Robots Exclusion Standard, and Sitemaps. Even so, we found that over a set of 9 test forums, more than 47% of the pages fetched by a generic crawler following these protocols are duplicate or uninformative.
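The duplicate-link problem described above can be illustrated with a small sketch. It is not the method proposed for FoCUS; it only shows how several URLs can collapse to one page once presentation-only parts are ignored. The parameter names in `PRESENTATION_PARAMS` and the example URLs are assumptions made for illustration, and a real forum would need its own list.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query parameters that change only how a page is displayed,
# not which page is shown. A real forum needs its own list.
PRESENTATION_PARAMS = {"view", "sort", "highlight"}


def canonicalize(url):
    """Return a normalized form of the URL used only for duplicate detection."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PRESENTATION_PARAMS
    )
    # Lowercase scheme and host, sort the remaining parameters, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def drop_duplicates(urls):
    """Keep the first URL of each group that shares a canonical form."""
    seen = set()
    unique = []
    for url in urls:
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


links = [
    "http://forum.example.com/thread.php?id=7",
    "http://forum.example.com/thread.php?id=7&view=title",
    "http://FORUM.example.com/thread.php?id=7#post12",
    "http://forum.example.com/thread.php?id=8",
]
unique_links = drop_duplicates(links)
```

A hand-written rule list like this does not scale across forum software packages, which is one reason a forum-specific crawler is attractive.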
The 47% figure is a little higher than the 40% reported elsewhere, but both show the inefficiency of generic crawlers. Besides duplicate links and uninformative pages, a long forum board or thread is usually divided into multiple pages that are linked by page-flipping links. Generic crawlers process each page individually and ignore the relationships between such pages. These relationships should be preserved while crawling to facilitate downstream tasks such as page wrapping and content indexing. For example, the pages belonging to one thread should be concatenated in order, so that all posts of the thread, as well as the reply relationships between posts, can be extracted.
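As a minimal sketch of what preserving page-flipping relationships means, the function below walks the "next page" links of a thread and concatenates the posts in order. The `pages` dictionary layout, its `posts` and `next` keys, and the example URLs are all assumptions made for illustration; they stand in for whatever a crawler has already fetched and parsed.

```python
def collect_thread_posts(first_url, pages):
    """Concatenate the posts of one thread by following page-flipping links.

    `pages` maps a page URL to a dict with a list of posts and the URL of
    the next page (or None on the last page). This layout is hypothetical.
    """
    posts = []
    visited = set()
    url = first_url
    while url is not None and url not in visited:
        visited.add(url)  # guard against page-flipping links that loop back
        page = pages.get(url)
        if page is None:  # page was never fetched; stop with what we have
            break
        posts.extend(page["posts"])
        url = page.get("next")
    return posts


pages = {
    "thread/7?page=1": {"posts": ["question", "reply 1"], "next": "thread/7?page=2"},
    "thread/7?page=2": {"posts": ["reply 2"], "next": "thread/7?page=3"},
    "thread/7?page=3": {"posts": ["reply 3"], "next": None},
}
thread_posts = collect_thread_posts("thread/7?page=1", pages)
```

A crawler that treats the three pages as unrelated documents would lose the fact that "reply 3" belongs to the same thread as "question".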

