The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    Need a program that can compile lots of word and image files into one for easy searching

    Discussion in 'Windows OS and Software' started by NumLock, May 29, 2010.

  1. NumLock

    NumLock Notebook Evangelist

    Reputations:
    38
    Messages:
    367
    Likes Received:
    1
    Trophy Points:
    31
    Hi,

    Just wanted to ask if anyone knows of a program in which I can copy and paste a lot of formatted Word files and images into just one file, with the key goal of faster text searching.

    It may be an offline software solution like PDFs and CHMs, or a content management system (CMS), but I would prefer the former type as it would be more portable.

    the data:
    It's really a lot of data: a court case, with a lot of transcripts, affidavits, etc., which I have no clue about since I don't do law; I'm just doing this for my uncle. The data came from an old hardcopy of the whole case, which was typed/scanned into about 400MB of combined Word and image files.

    What I have tried:
    - CHM - tried HelpScribble but it was very limited and I had to manually define indexes.
    - PDF - similar to CHM, but placing 250MB worth of images into one PDF file could put a heavy load on any computer...

    Might anyone know of a possible solution, or suggest a technology that I can research to solve my needs?

    Right now, if I can't find anything, I might end up just creating a CMS or an offline forum with good index searching.
     
  2. Joel

    Joel coffeecoffeecoffeecoffee

    Reputations:
    1,059
    Messages:
    1,663
    Likes Received:
    0
    Trophy Points:
    55
  3. jackluo923

    jackluo923 Notebook Virtuoso

    Reputations:
    1,038
    Messages:
    3,071
    Likes Received:
    1
    Trophy Points:
    105
    Microsoft OneNote
     
  4. kosti

    kosti Notebook Virtuoso

    Reputations:
    596
    Messages:
    2,162
    Likes Received:
    466
    Trophy Points:
    101
    Sounds like you need a free-form database. I use Ultra Recall from Kinook Software.
     
  5. Joel

    Joel coffeecoffeecoffeecoffee

    Reputations:
    1,059
    Messages:
    1,663
    Likes Received:
    0
    Trophy Points:
    55
  6. gerryf19

    gerryf19 I am the walrus

    Reputations:
    2,275
    Messages:
    3,990
    Likes Received:
    0
    Trophy Points:
    105
    This will not help with the pictures, but a grep program can search through many files with text.

    So, for example, download wingrep, place all word files in one folder, then tell wingrep to search for whichever expression (word, phrase) you want. It will search the entire folder very quickly.
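To illustrate the grep idea above, here is a minimal Python sketch of the same kind of folder search. It is not wingrep itself, and it assumes the Word documents have first been saved as plain-text `.txt` files; the function name `grep_folder` and its parameters are made up for this example.

```python
import re
from pathlib import Path


def grep_folder(folder, pattern, suffix=".txt"):
    """Return (file name, line number, line) for every line matching pattern."""
    # Case-insensitive so "affidavit" also finds "Affidavit"
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    # rglob walks the folder and all of its subfolders
    for path in sorted(Path(folder).rglob("*" + suffix)):
        # errors="replace" keeps the scan going if a file contains odd bytes
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.name, number, line.strip()))
    return hits
```

Calling something like `grep_folder("case_files", r"affidavit")`, where `case_files` is a placeholder folder name, would return one tuple per matching line. Like any grep-style tool, this only sees text, so it does nothing for the scanned images.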
     
  7. newsposter

    newsposter Notebook Virtuoso

    Reputations:
    801
    Messages:
    3,881
    Likes Received:
    0
    Trophy Points:
    105
    depends on how the user wants to tag their images. just about every reasonable image format has a variant with exif tags. populate those tag fields and they can be indexed/searched.
     
  8. NumLock

    NumLock Notebook Evangelist

    Reputations:
    38
    Messages:
    367
    Likes Received:
    1
    Trophy Points:
    31
    Thanks. I've tried an offline forum, using a portable XAMPP web server. It took me 2 hours to set things up. Everything was easy: copying and pasting Word content into the posts with the WYSIWYG editor preserved the formatting automatically.

    Everything was going great until I encountered a major blockage: the forum did not support an ordered list within an ordered list, and so on. :(

    I will try out Ultra Recall now. It looks very promising.

    As for the pictures, only the titles are needed (the pictures are scanned copies of balance sheets, etc.). This is the hardest part for me, as I will be typing out all the titles one by one.
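Once the titles are typed out, one simple way to make them searchable is to keep them in a small index file next to the images. The sketch below is only an assumption about how that could look: it expects a hypothetical CSV file with a header row `filename,title`, and the function name `search_titles` is invented for the example.

```python
import csv


def search_titles(index_path, keyword):
    """Return (filename, title) pairs whose title contains keyword, ignoring case."""
    needle = keyword.lower()
    with open(index_path, newline="", encoding="utf-8") as handle:
        # DictReader uses the header row (filename,title) as the column names
        reader = csv.DictReader(handle)
        return [
            (row["filename"], row["title"])
            for row in reader
            if needle in row["title"].lower()
        ]
```

For example, `search_titles("titles.csv", "balance sheet")` would list every scanned image whose typed title mentions a balance sheet; `titles.csv` is a placeholder name.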