Strigi's features

General Features

  • very fast crawling
  • very small memory footprint
  • no hammering of the system
  • portable: runs on different operating systems (currently GNU/Linux, Solaris, Mac OS X and Windows)
  • pluggable backend: currently CLucene and Hyper Estraier; sqlite3 and Xapian are in the works
  • DBus and socket interfaces are available for communication between the daemon and search programs
  • simple interface for implementing plugins that extract information. We'll try to reuse the kat plugins, although native plugins will have a large speed advantage
  • calculation of the SHA-1 digest of every file crawled (allows fast finding of duplicates)
  • automatic detection of file system updates: Strigi's index always stays synchronized with your file system contents (this feature is still experimental)
  • inotify support: keeps the index up to date with your file system
    contents (warning: this is still under development)
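
To illustrate why a per-file SHA-1 digest makes duplicate detection fast, here is a minimal sketch in Python. It is not Strigi's code: the function names and the in-memory mapping of paths to contents are assumptions made for the example. The idea is simply that files with identical contents share a digest, so grouping by digest finds duplicates without comparing files pairwise.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def sha1_of(data: bytes) -> str:
    """Return the hexadecimal SHA-1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


def find_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group paths whose contents have the same SHA-1 digest.

    `files` is a hypothetical stand-in for crawled files: it maps a
    path to that file's contents. Only groups with more than one path
    are returned, each group sorted by path.
    """
    by_digest = defaultdict(list)
    for path, content in files.items():
        by_digest[sha1_of(content)].append(path)
    return sorted(sorted(paths) for paths in by_digest.values() if len(paths) > 1)


if __name__ == "__main__":
    crawled = {
        "notes/a.txt": b"hello",
        "backup/a.txt": b"hello",
        "notes/b.txt": b"world",
    }
    for group in find_duplicates(crawled):
        print(group)
```

Because the digest is computed once while each file is crawled, a later duplicate query only has to look up equal digests in the index.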

Supported file types

Strigi is able to index the contents of the following file types:

  • Plain text
  • PDF
  • Archive files
  • MP3
  • OASIS spreadsheet
  • OASIS text file
  • OASIS presentation
  • Debian package
  • RPM package

JStreams

JStreams are the "heart" of Strigi. Their aim is to provide a standardized interface for accessing the contents of different file types.
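
The idea of a standardized interface can be sketched as follows. This is a Python illustration, not the JStreams API: the class names (Stream, BytesStream, InflateStream) and the read() signature are hypothetical. It shows one uniform read call serving both plain data and compressed data, with one stream wrapping another, so that the consumer does not need to know which kind it is reading.

```python
import zlib
from abc import ABC, abstractmethod


class Stream(ABC):
    """Hypothetical uniform interface: every source yields bytes via read()."""

    @abstractmethod
    def read(self, max_bytes: int) -> bytes:
        """Return up to max_bytes bytes (max_bytes > 0), or b"" at end of stream."""


class BytesStream(Stream):
    """Stream over data that is already in memory."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0

    def read(self, max_bytes: int) -> bytes:
        chunk = self._data[self._pos:self._pos + max_bytes]
        self._pos += len(chunk)
        return chunk


class InflateStream(Stream):
    """Stream that decompresses zlib data pulled from another Stream."""

    def __init__(self, source: Stream) -> None:
        self._source = source
        self._inflater = zlib.decompressobj()
        self._buffer = b""
        self._eof = False

    def read(self, max_bytes: int) -> bytes:
        # Pull compressed input until enough output is buffered or input ends.
        while len(self._buffer) < max_bytes and not self._eof:
            raw = self._source.read(4096)
            if raw:
                self._buffer += self._inflater.decompress(raw)
            else:
                self._buffer += self._inflater.flush()
                self._eof = True
        chunk, self._buffer = self._buffer[:max_bytes], self._buffer[max_bytes:]
        return chunk


def read_all(stream: Stream, chunk_size: int = 8) -> bytes:
    """Consume any Stream through the shared interface."""
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parts.append(chunk)
    return b"".join(parts)


if __name__ == "__main__":
    text = b"Strigi indexes the contents of many file types"
    plain = BytesStream(text)
    packed = InflateStream(BytesStream(zlib.compress(text)))
    print(read_all(plain) == read_all(packed))
```

The same read_all function handles both cases; supporting another format in this sketch would mean adding one more Stream subclass rather than changing the consumer.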