The Notebook Review forums were hosted by TechTarget, which shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    10,000 PDFs : need management software AND full text search

    Discussion in 'Windows OS and Software' started by Rad Gravity, May 13, 2015.

  1. Rad Gravity

    Rad Gravity Notebook Enthusiast

    Reputations:
    0
    Messages:
    15
    Likes Received:
    1
    Trophy Points:
    6
    Over 10,000 PDFs that need to be displayed in a format similar to Calibre. Calibre would be perfect if it offered FULL TEXT search capabilities. Currently, items in Calibre are only searchable by what is manually typed into tags. What a shame.

    Windows 8 Search isn't good enough, either. The search results do not show enough detail: where the search term is located, how often it occurs, or a few sentences of text around each hit.
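To make the request above concrete (location, frequency, and some surrounding text for each hit), here is a minimal Python sketch. It assumes the text of each page has already been extracted by some other tool, which is the hard part and is not shown. The function name `search_pages` and the sample pages are made up for illustration and do not come from any product mentioned in this thread.

```python
import re

def search_pages(pages, term, context=40):
    """Return (page_number, hit_count, snippets) for each page containing term.

    pages: list of already-extracted page texts (extraction is assumed to be
    handled elsewhere). Matching is case-insensitive and literal.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    results = []
    for number, text in enumerate(pages, start=1):
        matches = list(pattern.finditer(text))
        if not matches:
            continue
        snippets = []
        for m in matches:
            # Keep up to `context` characters on each side of the hit.
            start = max(0, m.start() - context)
            end = min(len(text), m.end() + context)
            snippets.append(text[start:end].strip())
        results.append((number, len(matches), snippets))
    return results

# Hypothetical sample data, standing in for text pulled out of a PDF.
sample_pages = [
    "Entropy is introduced in chapter two.",
    "Nothing relevant here.",
    "The entropy of a closed system; entropy never decreases.",
]
for page, count, snippets in search_pages(sample_pages, "entropy"):
    print(page, count, snippets)
```

Each result carries the page number, the number of hits on that page, and one snippet per hit, which covers the three things the Windows search results were missing.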
     
  2. TreeTops Ranch

    TreeTops Ranch Notebook Deity

    Reputations:
    330
    Messages:
    904
    Likes Received:
    124
    Trophy Points:
    56
    There is software that can create searchable PDFs. I use Nuance Power PDF Advanced. It also has automatic features that can scan a directory and process every file in it. Adobe Acrobat may also be able to create searchable PDFs in batch.
     
  3. Pirx

    Pirx Notebook Virtuoso

    Reputations:
    3,001
    Messages:
    3,005
    Likes Received:
    416
    Trophy Points:
    151
    Do you think X1 Search might fit your bill?
     
  4. Mr.Koala

    Mr.Koala Notebook Virtuoso

    Reputations:
    568
    Messages:
    2,307
    Likes Received:
    566
    Trophy Points:
    131
    Mendeley does full text search reasonably well. Local usage is free. Online sync is paid above 2 GB. Win/Mac/Linux are supported. There's also an iOS app, and an Android app is coming soon.

    However, you need to keep in mind that not all PDF text is plain-text extractable, and the hardcoded layout often gets in the way. If your sentence is broken across two lines, blocks, or pages, you have a smaller chance of hitting it.
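One common way to soften the line-break problem described above is to normalize the extracted text before searching. The following is a minimal sketch of that idea in Python, not a description of how Mendeley works internally. The helper names `normalize` and `phrase_found` and the sample text are hypothetical.

```python
import re

def normalize(text):
    """Collapse layout artifacts so phrases split across lines still match."""
    # Rejoin words hyphenated at a line break: "extrac-\ntion" -> "extraction".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Treat every remaining run of whitespace (including newlines) as one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip().lower()

def phrase_found(raw_text, phrase):
    """True if the phrase occurs in raw_text once both are normalized."""
    return normalize(phrase) in normalize(raw_text)

# Hypothetical extracted text with a hyphenated break and a plain line break.
raw = "Full text extrac-\ntion is not always\npossible for scanned books."
print(phrase_found(raw, "extraction is not always possible"))
```

This only helps with breaks inside one extracted block. It also joins genuine compounds that happen to break at a line end (for example "full-" / "text" becomes "fulltext"), and it does nothing for text that was never extractable in the first place.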

    One problem with Mendeley is the annotation system, which is very limited and saves in a non-standard format.
     
    Last edited: May 18, 2015
  5. TreeTops Ranch

    TreeTops Ranch Notebook Deity

    Wonder if the OP was scamming us. He hasn't returned to make any comments about our suggestions. And 10,000 PDFs? That in itself seems odd. Not enough info to explain that either.
     
  6. Mr.Koala

    Mr.Koala Notebook Virtuoso

    10,000 PDFs is nothing uncommon. I know people who manage that many book PDFs.
     
  7. TreeTops Ranch

    TreeTops Ranch Notebook Deity

    You got one up on me then. I just can't imagine reworking that many PDFs so that they are searchable. Even if you could do that in batch mode it would take hours, maybe days. If they are book PDFs then the chore is even harder. If the average book is 100 pages then you will have 1 million pages to do. Impracticable.
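The back-of-the-envelope figure above can be written out as a short Python sketch. The file count and average page count come from the post. The per-page processing time is purely an assumed placeholder, not a measurement of Nuance, Acrobat, or any other tool.

```python
def total_pages(num_pdfs, avg_pages_per_pdf):
    """Total page count for a library of PDFs."""
    return num_pdfs * avg_pages_per_pdf

def batch_hours(pages, seconds_per_page):
    """Wall-clock hours for a sequential batch at an assumed per-page cost."""
    return pages * seconds_per_page / 3600

pages = total_pages(10_000, 100)          # figures from the post
print(pages)                              # 1000000
# ASSUMPTION: 1 second per page is a placeholder, not a benchmark.
print(round(batch_hours(pages, 1.0), 1))  # 277.8
```

At that assumed rate the batch would run for about 278 hours, or roughly eleven and a half days, so "hours, maybe days" depends entirely on the real per-page cost.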
     
  8. Mr.Koala

    Mr.Koala Notebook Virtuoso

    They take tons of notes, search text only within the documents referenced in the notes on a specific topic, and of course sort entries into categories.

    Global full text search is not practical at all.
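The workflow described above, searching only within the documents filed under one topic instead of the whole library, could look roughly like this in Python. It is a minimal sketch: the dictionaries, file names, and the function `scoped_search` are hypothetical, and the text of each document is assumed to be already extracted.

```python
def scoped_search(documents, categories, category, term):
    """Search only the documents filed under one category.

    documents: dict mapping a document name to its extracted text.
    categories: dict mapping a category name to a list of document names.
    """
    term = term.lower()
    hits = []
    for name in categories.get(category, []):
        if term in documents.get(name, "").lower():
            hits.append(name)
    return hits

# Hypothetical library, for illustration only.
documents = {
    "thermo.pdf": "Entropy and the second law.",
    "info.pdf": "Shannon entropy of a source.",
    "cooking.pdf": "Bread recipes.",
}
categories = {"physics": ["thermo.pdf"], "information theory": ["info.pdf"]}
print(scoped_search(documents, categories, "physics", "entropy"))
```

Because only the named category is scanned, the cost of a query grows with the size of that category, not with the size of the whole collection.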
     
    Last edited: May 29, 2015
  9. Rad Gravity

    Rad Gravity Notebook Enthusiast

    Hey, sorry -- been very busy traveling lately.

    The PDFs are books and academic articles. Most are over 200 pages and hence the need for a good search capability.


    X1 Search looks good. Will download the trial, thanks!