==============================================
INSTALLATION INSTRUCTIONS FOR BINGO! Framework
==============================================

This document contains a short installation guide for the BINGO!
framework. It mainly describes the installation on the Win32 platform.
With minor changes, the same procedures can be used on other operating
systems (e.g. Linux).

1) Prerequisites
2) Installation
3) Customization
4) Troubleshooting

===========================================================

1) Prerequisites

To run the BINGO! framework, the following installations are required:

a) Java 2 JDK (e.g., SUN jdk 1.4 or higher)
b) MySQL or Oracle database instance
c) Adobe Acrobat and IFilter (only for PDF processing)
d) Apache Tomcat 4.1 or higher (only for the component BingoReviser)
e) Oracle JDBC driver package (only for Oracle databases)

Important: to ensure that the desired JVM is in use, open a command
prompt window and execute the command 'java -version'. If the returned
version is incorrect or the command leads to an error, ensure that the
PATH environment variable is properly set.

===========================================================

2) Installation

a) Copy the whole content of this folder onto your hard disk.
   To avoid problems with external components, the path should NOT
   contain spaces. c:\tmp\bingo\ is a good choice.

b) Run schemaMySQL.bat (for MySQL databases) or schema.bat (for Oracle
   databases). This creates a new database account for BINGO! by
   executing the SQL scripts from the "schema" folder. For a
   non-standard database configuration (user-defined tablespace
   names, etc.) you can customize these scripts manually.
   If you want to use an Oracle database, you will need to download
   the Oracle JDBC driver package. For further details, see
   'customize.txt'.

c) To start BINGO!, run r.bat

The current version contains some "batch" workflows (pre-defined
sequences of standard operations in a proper order).
You can access them through the "BINGO" menu entry of BingoDesktop:

- Batch from Bookmarks: import and parse the bookmark file, retrieve
  all documents, store elements into the database, apply DF+MI feature
  selection, build SVM model(s), store the full BINGO model into the
  database.

- Batch from DB: read the previously stored model from the database,
  initialize the crawler (register all visited URLs).

- Crawl from DB: read the previously stored model from the database,
  initialize the crawler (register all visited URLs) and fill the
  URL queue with some extracted but not yet crawled links from the
  database.

The crawler can be started on
- manually selected links
- prepared links from the database (table start_urls)
- some URLs from a previous crawl (after "Crawl from DB")

To verify documents and feature spaces for each topic, you can use the
collection of JSP servlets 'BingoReviser'. The simplest way to install
'BingoReviser' is to copy the JSP files and Java classes of its
distribution into the appropriate directories of an existing JSP
repository (e.g., 'jsp-examples'). The root page of the Reviser is
called 'bingo_feed_start.htm' and can be accessed in our example via

http://hostname:8080/jsp-examples/bingo/bingo_feed_start.htm

(depending on your custom settings, the port and the directory of this
location may differ).

===========================================================

3) Customization

See the file customize.txt for details on BINGO! customization and
recent modifications.

===========================================================

4) Troubleshooting

a) After starting BINGO!, I get the error messages "Method not
   implemented" or "Method unknown".

   - Verify that the proper JVM is in use. Background: when you
     install Oracle software, it automatically places its own JRE 1.3
     into the PATH environment variable before 'your' entry. In this
     case, check your PATH in the Windows system settings and simply
     remove the Oracle entries by hand.
b) No database connection, error 'Adapter cannot establish the
   connection'.

   - Try to enter the connection parameters by hand (rather than using
     the pre-configured shortcuts). Use the full host name of the
     database instance. Network timeouts can also be caused by
     restrictive firewalls. The shortcuts for database connections are
     stored in the plain text file 'bingo/data/accounts.dat' and can
     be modified by hand.

c) All crawled PDF files are empty, although the PDF mime type is
   enabled in the BINGO! settings.

   - Verify that the PDF filter (Adobe IFilter) is properly installed.
     BINGO! contains a small test program
     'bingo/crawler/handler/Filtdump.exe' that uses this filter and
     dumps the extracted content of a given PDF file to the screen.

d) I try to re-create my database account by running 'schema.bat', but
   it produces a bunch of errors and does not drop my old account.

   - Verify that all database connections for that user/database are
     closed. A user with active connections to the database cannot be
     dropped. Don't forget to close BINGO!.

e) I want to see the data in my database!

   - Use the BINGO! MiniClient ('mini.bat' or BINGO menu
     'Database->MiniClient') to execute SQL commands and queries.
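
Note on the JVM check from 1) Prerequisites and 4a): the same check can
be made from Java itself. The following is only a minimal sketch and is
not part of the BINGO! distribution; the class name JvmVersionCheck and
its methods are hypothetical names chosen for illustration. It prints
the version and home directory of the JVM that is actually started and
compares the version with the 1.4 minimum listed in the prerequisites.
It assumes that the class is compiled with the same JDK that is used to
run it.

```java
// Illustrative sketch only: JvmVersionCheck is a hypothetical helper,
// not a class shipped with BINGO!.
class JvmVersionCheck {

    /**
     * Returns the first two numeric components of a version string,
     * e.g. "1.4.2_05" gives {1, 4} and "17.0.2" gives {17, 0}.
     * Missing components are reported as 0.
     */
    public static int[] majorMinor(String version) {
        int[] result = {0, 0};
        int found = 0;
        String[] tokens = version.split("[^0-9]+");
        for (int i = 0; i < tokens.length && found < 2; i++) {
            if (tokens[i].length() > 0) {
                result[found++] = Integer.parseInt(tokens[i]);
            }
        }
        return result;
    }

    /** True if the given version is at least major.minor. */
    public static boolean isAtLeast(String version, int major, int minor) {
        int[] actual = majorMinor(version);
        if (actual[0] != major) {
            return actual[0] > major;
        }
        return actual[1] >= minor;
    }

    public static void main(String[] args) {
        // Standard system properties describing the running JVM.
        String version = System.getProperty("java.version");
        System.out.println("java.version = " + version);
        System.out.println("java.home    = " + System.getProperty("java.home"));

        if (isAtLeast(version, 1, 4)) {
            System.out.println("JVM version satisfies the 1.4 minimum.");
        } else {
            System.out.println("JVM is older than 1.4; check the PATH order.");
        }
    }
}
```

If the reported java.home points into an Oracle installation directory
rather than into your own JDK, the PATH order described in 4a) is the
likely cause.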