README.rdoc
{<img src="https://codeclimate.com/github/CV-Gate/search_for_text_into_pdfs.png" />}[https://codeclimate.com/github/CV-Gate/search_for_text_into_pdfs]
== Search for text into PDF documents. App example.
This application is a simple example about how a text search can be done into pdf documents. It works with Sphinx and the pdf-reader gem.
== Getting Started
1. Install Sphinx
2. Into the app configure the database connection (Sphinx only will work with MySQL or PostgreSQL)
3. Execute rails s and upload some PDFs
4. Run <code>rake ts:index</code> and <code>rake ts:start</code>
5. Run <code>whenever --update-crontab pdf_index</code> to start the cron job that reindex the records
6. You can also configure the cron job in Rails, now it's working each minute for testing purposes
== Limitations
The app stores texts into DB. The limit for MySQL is 4294967295 characters, so biggest PDFs will trim while storing. It's also possible that the DB server will throw a time-out.
== Todo
- Validate texts on size (perhaps too expensive)
- Write some tests
- Some refactor