poliveira89/jetys

JETYS - Java tExT indexer sYStem

Description

This project brings together several objectives and challenges in building an Information Retrieval system, more precisely an engine for indexing and searching unstructured information. Initially it will index the information and data contained in text files.

Features

  • Read a Corpus
  • Tokenization
  • Stemming
  • Filter by Stop Words
  • Indexing by flexible Rules defined in the code
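
As a rough illustration of how these steps fit together, here is a minimal sketch and not the actual JETYS code. Every name in it (`IndexerSketch`, `tokenize`, `stem`, `index`, `STOP_WORDS`) is hypothetical. It assumes the corpus is already in memory as strings, and the stemmer is a deliberately naive suffix stripper standing in for a real stemming algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the JETYS code base.
final class IndexerSketch {

    // Illustrative stop-word list (an assumption); a real list would be much longer.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "the", "of", "is", "are");

    // Tokenization: lower-case the text and split on runs of characters
    // that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stemming: strip one common suffix, keeping at least three characters.
    // This is a naive placeholder, not a real stemming algorithm.
    static String stem(String token) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (token.endsWith(suffix) && token.length() - suffix.length() >= 3) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    // Indexing: drop stop words, stem the rest, and record the document id
    // under each term of the inverted index (term -> document ids).
    static void index(Map<String, Set<String>> invertedIndex, String docId, String text) {
        for (String token : tokenize(text)) {
            if (STOP_WORDS.contains(token)) {
                continue;
            }
            invertedIndex.computeIfAbsent(stem(token), k -> new TreeSet<>()).add(docId);
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> invertedIndex = new TreeMap<>();
        index(invertedIndex, "doc1", "The indexer is indexing text files.");
        index(invertedIndex, "doc2", "Searching the indexed files");
        invertedIndex.forEach((term, docs) -> System.out.println(term + " -> " + docs));
    }
}
```

The sorted `TreeMap` and `TreeSet` are used only to make the printed index easy to read. Note that the tokenizer in this sketch splits on hyphens, so a token such as `text-files` becomes two terms.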

Motivation

The main motivation for this project is the desire to learn, plan, and overcome new challenges, in this case by studying and applying the concepts and philosophies of the field of Information Retrieval.

TODO

  • Resolve the Regex pattern problems involving hyphens between characters and numbers
  • Query Module
  • Perform Information Retrieval calculations and tests
  • Unit Testing
  • Error handling
  • Documentation / Wiki
  • Increase Performance
  • Generate a library from this code

Contributions

Thanks to Daniel Santos for several contributions to the code.
