jvtoppa/YARI

What is this code?

This is an implementation of the Re-Pair compression algorithm (or Byte-Pair Encoding, if you come from machine learning), made for research purposes. Try running it on different data :) It is still a work in progress.
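For readers new to Re-Pair, here is a minimal sketch of the idea: repeatedly replace the most frequent adjacent pair with a fresh symbol until no pair occurs twice. This is not the code in this repository. It rescans the whole sequence every round, so it is far slower than a queue-and-hash-table design like the one in the class diagram. All names (`Grammar`, `naiveRepair`, `decompress`) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Symbol = std::size_t;
using Rule = std::pair<Symbol, Symbol>;

// Hypothetical result type: the reduced sequence plus the rules that rebuild it.
struct Grammar {
    std::vector<Symbol> sequence;  // what remains after all replacements
    std::vector<Rule> rules;       // rules[i] defines symbol firstNewSymbol + i
    Symbol firstNewSymbol;         // every input symbol must be smaller than this
};

// Naive Re-Pair: one full rescan per round (illustration only).
Grammar naiveRepair(std::vector<Symbol> seq, Symbol firstNewSymbol) {
    for (Symbol s : seq) assert(s < firstNewSymbol);
    Grammar g;
    g.firstNewSymbol = firstNewSymbol;
    while (true) {
        // Count non-overlapping occurrences of every adjacent pair.
        std::map<Rule, std::size_t> freq;
        for (std::size_t i = 0; i + 1 < seq.size(); ++i) {
            ++freq[Rule(seq[i], seq[i + 1])];
            // In a run like "aaa" the pair (a,a) at i+1 overlaps the one at i.
            if (seq[i] == seq[i + 1] && i + 2 < seq.size() && seq[i + 2] == seq[i]) ++i;
        }
        // Pick the most frequent pair; ties go to the smallest pair in map order.
        Rule best;
        std::size_t bestFreq = 0;
        for (const auto& entry : freq)
            if (entry.second > bestFreq) { best = entry.first; bestFreq = entry.second; }
        if (bestFreq < 2) break;  // nothing repeats, so no rule would pay off

        // Replace occurrences left to right with a fresh symbol.
        const Symbol fresh = firstNewSymbol + g.rules.size();
        g.rules.push_back(best);
        std::vector<Symbol> out;
        for (std::size_t i = 0; i < seq.size();) {
            if (i + 1 < seq.size() && seq[i] == best.first && seq[i + 1] == best.second) {
                out.push_back(fresh);
                i += 2;
            } else {
                out.push_back(seq[i]);
                ++i;
            }
        }
        seq.swap(out);
    }
    g.sequence = seq;
    return g;
}

// Recursively expand one symbol back into original input symbols.
void expand(const Grammar& g, Symbol s, std::vector<Symbol>& out) {
    if (s < g.firstNewSymbol) { out.push_back(s); return; }
    assert(s - g.firstNewSymbol < g.rules.size());
    const Rule& r = g.rules[s - g.firstNewSymbol];
    expand(g, r.first, out);
    expand(g, r.second, out);
}

std::vector<Symbol> decompress(const Grammar& g) {
    std::vector<Symbol> out;
    for (Symbol s : g.sequence) expand(g, s, out);
    return out;
}
```

For example, encoding "abracadabra" as `{0, 1, 2, 0, 3, 0, 4, 0, 1, 2, 0}` and calling `naiveRepair(input, 5)` produces three rules and a five-symbol sequence, and `decompress` returns the original input.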

Class Diagram

classDiagram
    direction LR
    
    class Repair {
        -QUEUE q
        -unordered_map~pair, PAIR*~ ht
        -TSEQ seq
        -st rule
        -vector~st~ ruleHistory
        -firstPass()
        -compress(bool)
        +run(bool)
        +output()
    }

    class QUEUE {
        -vector~PAIRNODE*~ buckets
        +addPair(PAIR*)
        +removePair(PAIR*)
    }

    class PAIRNODE {
        +PAIR* p
        +PAIRNODE* next
        +PAIRNODE* prev
    }

    class TSEQ {
        <<THREADED SEQUENCE>>
        -vector~SEQ~ seq
        +next(st)
        +prev(st)
        +operator[](st)
    }

    class PAIR {
        +st left
        +st right
        +st freq
        +st f_pos
        +st b_pos
        +PAIR* next
        +PAIR* prev
        +PAIRNODE* node
    }

    class SEQ {
        +st code
        +st prev
        +st next
    }

    %% Relationships
    Repair *-- QUEUE : owns
    Repair *-- TSEQ : owns
    QUEUE o-- PAIRNODE : contains buckets (linked lists) of
    PAIRNODE o-- PAIR : points to data
    PAIR --> QUEUE : can be added to bucket
    TSEQ *-- SEQ : composed of
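The diagram describes `QUEUE` as buckets of linked lists. The sketch below shows one way such a frequency-bucketed queue can work, assuming each bucket holds the pairs with exactly that frequency. It is an assumption-laden illustration, not the repository's implementation: `PairRec` and `BucketQueue` are hypothetical names, and the list links live directly in the pair record instead of in a separate `PAIRNODE`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for PAIR, reduced to what the queue needs.
struct PairRec {
    std::size_t left, right;  // the two symbols of the pair
    std::size_t freq;         // current number of occurrences
    PairRec* next;            // neighbours inside the same bucket
    PairRec* prev;
};

// Frequency-bucketed queue: buckets_[f] is a doubly linked list of pairs with freq == f.
class BucketQueue {
public:
    explicit BucketQueue(std::size_t maxFreq) : buckets_(maxFreq + 1, nullptr) {}

    // Push p at the head of the bucket matching its frequency: O(1).
    void add(PairRec* p) {
        assert(p != nullptr && p->freq < buckets_.size());
        PairRec*& head = buckets_[p->freq];
        p->prev = nullptr;
        p->next = head;
        if (head) head->prev = p;
        head = p;
    }

    // Unlink p from its bucket: O(1). Precondition: p is currently in the queue.
    void remove(PairRec* p) {
        assert(p != nullptr && p->freq < buckets_.size());
        if (p->prev) p->prev->next = p->next;
        else buckets_[p->freq] = p->next;  // p was the head of its bucket
        if (p->next) p->next->prev = p->prev;
        p->prev = p->next = nullptr;
    }

    // Move p to another bucket after its frequency changed.
    void changeFreq(PairRec* p, std::size_t newFreq) {
        remove(p);
        p->freq = newFreq;
        add(p);
    }

    // Some pair with the highest frequency >= 2, or nullptr if none is left.
    PairRec* max() const {
        for (std::size_t f = buckets_.size(); f-- > 2;)
            if (buckets_[f]) return buckets_[f];
        return nullptr;
    }

private:
    std::vector<PairRec*> buckets_;
};
```

The queue never owns the pair records; it only links them, so insertion and removal are constant time once a pointer to the record is known. In the diagram, the `ht` map in `Repair` would presumably supply that pointer.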

(This is all for now; I'll add more later.)

About

Yet Another Re-Pair Implementation
