Thread: Seeking Gnutella Search Keyword Indexer

  1. #1

    ZeroPaid Regular

    Join Date
    Apr 2004
    Posts
    7

    Seeking Gnutella Search Keyword Indexer

    _________________


    I am seeking a Gnutella tool that passively monitors searches on the network and saves
    the search keywords to an index file so the data can be manipulated. In other words, I
    want to be able to connect to the Gnutella network, receive search queries, and save the
    search queries to a list.

    I know some people have made modified Gnutella apps that do this, but I don't know where
    to find them. Search engines turn up no good results for such software because so
    many owners of irrelevant sites spam the index by using Gnutella-related keywords in their
    META tags to gain a higher search ranking.

    Anyone with information on where I can find such an app, please post links.

    Thank you.

    GnutellaSearcher

  2. #2

    ZeroPaid Regular

    Join Date
    Jan 2004
    Location
    Behind you
    Posts
    28
    Shareaza has a search monitor.

  3. #3

    ZeroPaid Regular

    Join Date
    Apr 2004
    Posts
    7

    Shareaza Search Monitor Useless to Purpose

    Originally posted by URICHO_Corp.:
        Shareaza has a search monitor.

    Yes, I use Shareaza and I realize it has a search monitor, but that does not solve the problem. The Shareaza search monitor does not allow one to *save* the searches to a file for data mining purposes. There is no way to copy and paste search keywords or save them from Shareaza.

    I am seeking an application that does exactly what the Shareaza search monitor does, but also allows the list of searches to be automatically written to a database or flat text file.

    GnutellaSearcher

  4. #4
    Jorge

    Zeropaid God

    Join Date
    Mar 2000
    Location
    San Diego, CA
    Posts
    3,309
    So you are trying to mine Gnutella for data, huh? Why don't you just write an app to do what you need it to do?
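
    For anyone who does take the write-it-yourself route, here is a minimal sketch of the core step, not a finished tool. It assumes the Gnutella 0.4 descriptor layout (a 23-byte header, payload type 0x80 for a Query, then a 2-byte minimum speed followed by a null-terminated search string) and that some other code has already done the handshake and handed over one complete descriptor as bytes. The names extract_query and append_query, and the file name in the usage note, are made up for illustration.

```python
import struct
from typing import Optional

HEADER_LEN = 23   # descriptor header size, assuming Gnutella 0.4
QUERY = 0x80      # payload descriptor value for a Query, assuming Gnutella 0.4


def extract_query(message: bytes) -> Optional[str]:
    """Return the search string from one complete descriptor, or None."""
    if len(message) < HEADER_LEN:
        return None
    payload_type = message[16]                       # bytes 0-15 are the descriptor ID
    (payload_len,) = struct.unpack_from("<I", message, 19)  # little-endian length
    if payload_type != QUERY:
        return None                                  # ping, pong, query hit, push, ...
    payload = message[HEADER_LEN:HEADER_LEN + payload_len]
    if len(payload) < 3:
        return None                                  # too short to hold any criteria
    criteria = payload[2:]                           # skip the 2-byte minimum speed
    text = criteria.split(b"\x00", 1)[0]             # stop at the first null byte
    return text.decode("utf-8", errors="replace").strip() or None


def append_query(path: str, text: str) -> None:
    """Append one search string to a flat text file, one query per line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(text.replace("\n", " ") + "\n")
```

    A monitoring loop would call extract_query on each descriptor it receives and pass any non-empty result to append_query("searches.txt", text). Anything a servent appends after the terminating null byte is ignored by this sketch.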

  5. #5

    ZeroPaid Regular

    Join Date
    Apr 2004
    Posts
    7
    Originally posted by Jorge:
        So you are trying to mine Gnutella for data, huh? Why don't you just write an app to do what you need it to do?

    People have already written such apps; it is just a matter of finding a download site. I do not have the time to learn a high-level programming language and the Gnutella protocol. I'd rather just download one of the existing apps, but they seem to be impossible to find with a search engine.

    GnutellaSearcher

  6. #6
    Sephiroth

    ZeroPaid Regular

    Join Date
    Apr 2002
    Location
    Florida
    Posts
    2,788
    They are impossible to find because I doubt the spammers release the software they use, so I don't think such a program exists. Those who study the network's statistics look at more detailed figures that indicate the health of the network; what people are searching for isn't one of them.

  7. #7

    ZeroPaid Regular

    Join Date
    Apr 2004
    Posts
    7
    Originally posted by Sephiroth:
        They are impossible to find because I doubt the spammers release the software they use, so I don't think such a program exists. Those who study the network's statistics look at more detailed figures that indicate the health of the network; what people are searching for isn't one of them.

    What does a gnutella query index have to do with spammers? I'm talking about collecting and categorizing search queries, not email addresses. I don't get what you mean about spammers not releasing the software they use, nor do I see the relevance to my question. Would you care to explain in detail?

  8. #8
    Sephiroth

    ZeroPaid Regular

    Join Date
    Apr 2002
    Location
    Florida
    Posts
    2,788
    Originally posted by GnutellaSearcher:
        What does a gnutella query index have to do with spammers? I'm talking about collecting and categorizing search queries, not email addresses. I don't get what you mean about spammers not releasing the software they use, nor do I see the relevance to my question. Would you care to explain in detail?

    No, but they can then send fake results that show up as spam telling people to visit some web site. Also, in the past people have faked search replies, and when someone downloaded and opened the file it would be porn, or a link to some porn site, and so on. This is not your typical email junk.

    I don't get why you want to categorize and index what people are searching for in the first place. A lot of it would be junk. With some servents, like BearShare, you won't be able to see anything their hosts are searching for if they have secure channels turned on, because it would be encrypted. You would also get some duplicate queries, and most of what you'd get would come from malicious hosts spamming the same searches over and over. That means the data would not represent the actual users of the network; it would be biased by malicious hosts and buggy servents. That is another reason why it's not done, along with the fact that, again, there is no practical reason to look at the content of the searches.

    That is why I don't believe one exists. Anyway, your best bet would be to ask the developers somewhere like the GDF, but unless you have a good reason to be trying to do this, I don't think they will be much help.
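
    To make the duplicate and repeat problem concrete, here is a rough sketch of one possible filter. It is only an illustration under stated assumptions: each logged record is a made-up tuple of (arrival time in seconds, descriptor ID, query text), an exact duplicate is a record whose descriptor ID has been seen before, and a "repeat" is the same normalized text arriving again within a window whose 600-second default is an arbitrary choice. The name filter_repeats is hypothetical.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

# (arrival time in seconds, descriptor ID, query text); a hypothetical log format
Record = Tuple[float, bytes, str]


def filter_repeats(records: Iterable[Record], window: float = 600.0) -> Iterator[Record]:
    """Drop exact duplicates and rapid repeats of the same query text."""
    seen_ids: Set[bytes] = set()
    last_seen: Dict[str, float] = {}
    for when, descriptor_id, text in records:
        if descriptor_id in seen_ids:
            continue                      # same message arriving via another path
        seen_ids.add(descriptor_id)
        key = " ".join(text.lower().split())   # normalize case and whitespace
        previous = last_seen.get(key)
        last_seen[key] = when             # every sighting restarts the quiet period
        if previous is not None and when - previous < window:
            continue                      # same text again too soon: treat as a repeat
        yield when, descriptor_id, key
```

    A filter like this cannot tell one host hammering the same search from many real users typing the same thing close together, so it trades one bias for another rather than removing it.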

  9. #9

    ZeroPaid Regular

    Join Date
    Apr 2004
    Posts
    7

    Thanks for the Insights

    Originally posted by Sephiroth:
        No, but they can then send fake results that show up as spam telling people to visit some web site. Also, in the past people have faked search replies, and when someone downloaded and opened the file it would be porn, or a link to some porn site, and so on. This is not your typical email junk.

        I don't get why you want to categorize and index what people are searching for in the first place. A lot of it would be junk. With some servents, like BearShare, you won't be able to see anything their hosts are searching for if they have secure channels turned on, because it would be encrypted. You would also get some duplicate queries, and most of what you'd get would come from malicious hosts spamming the same searches over and over. That means the data would not represent the actual users of the network; it would be biased by malicious hosts and buggy servents. That is another reason why it's not done, along with the fact that, again, there is no practical reason to look at the content of the searches.

        That is why I don't believe one exists. Anyway, your best bet would be to ask the developers somewhere like the GDF, but unless you have a good reason to be trying to do this, I don't think they will be much help.

    I appreciate the insights you have offered on this. What I take you to be saying is that a lot of the search queries on the network are generated by malicious hosts, and that "fake files" are put there by people trying to draw attention to their porn sites? Like when I download an ebook entitled "Magic Card Tricks" and, on opening it, find nothing but a link to some smut site or an online order form for David Copperfield videos? That is akin to spam, if that's what you're getting at and I have understood correctly.

    I am trying to build a search frequency index over a very long period of time, to map a diagram and concordance of the "consciousness" behind the most popular searches. Taking into account what you have stated, though, I would need some intelligent way to distinguish between good and bad results, which would require me to program my own application from scratch, with lots of sorting algorithms and intense AI built in. Argh!

    It sounds next to impossible to do, given the input you've posted. I'll need to think much about a different solution.

    I did find a news article that detailed the 30 most popular Gnutella search keywords, most of them related to porn and pop icons. I was hoping to find something that would detail the most popular thousand keywords over a long period of time, at least several days. I am interested to know the subject matter that dominates the consciousness behind Gnutella searches. Until I figure out a better approach, I think perhaps I can search for articles written by others who have approached this subject and see whether any of them report trends.
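
    The counting half of such a "top thousand keywords" list is the easy part. A minimal sketch, assuming the queries have already been collected as plain strings (for example, one per line in a text file); the stopword list and the name top_keywords are placeholders of my own, not taken from any existing tool:

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Illustrative only; a real list would be much longer.
STOPWORDS = {"the", "a", "of", "and"}


def top_keywords(queries: Iterable[str], limit: int = 1000) -> List[Tuple[str, int]]:
    """Count how many queries mention each keyword and return the most common."""
    counts: Counter = Counter()
    for query in queries:
        words = re.findall(r"[a-z0-9]+", query.lower())
        # Count each keyword once per query so one long query cannot inflate it.
        unique = sorted({word for word in words if word not in STOPWORDS})
        counts.update(unique)
    return counts.most_common(limit)
```

    Feeding it the lines of a log file would look like top_keywords(open("searches.txt", encoding="utf-8")), where searches.txt is a hypothetical file name. The counts are only as trustworthy as the queries that went in.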

    Thanks for all your input. I'll try to remember to post here if I find out anything interesting in my search, God willing.

    GnutellaSearcher
