Special normalisation rules #37

@amessina71

How should we deal with:

  1. numbers
  2. acronyms / symbols
  3. website / email spellings (e.g. use "dot", "at")

I would aim, as far as possible, for letter-based normalised representations of all of the above, such as:

  1. 100 -> one hundred (cento, cent)
  2. Hz -> hertz, WHO -> double u aitch o (less sure about this one ...)
  3. www.rai.it -> vu vu vu punto rai punto it, pippo@pluto.com -> pippo at pluto dot com

Of course, this would only be for the sake of comparison; no one would really want such transcripts as a final product. We don't even need to output the normalised text, except in a debug session.
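
To make the proposal concrete, here is a minimal sketch of what such a normaliser could look like, assuming English only and whitespace-separated tokens. All names (`SYMBOLS`, `number_to_words`, `spell_address`, `normalise`) are hypothetical and not part of any existing code; numbers are limited to 0-999, and acronym spelling (the "WHO" case) is left out because it is still undecided.

```python
import re

# Hypothetical lookup tables; each language would need its own set.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
SYMBOLS = {"Hz": "hertz"}  # assumed mapping, to be extended as needed

# Tokens such as www.rai.it or pippo@pluto.com (no trailing punctuation).
# Note: this also matches decimals such as 1.5, which a real rule set
# would have to handle separately.
ADDRESS_RE = re.compile(r"^[\w-]+(?:[.@][\w-]+)+$")


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 in English."""
    if not 0 <= n < 1000:
        raise ValueError("this sketch only covers 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    words = f"{ONES[hundreds]} hundred"
    return words if rest == 0 else f"{words} {number_to_words(rest)}"


def spell_address(token: str) -> str:
    """Replace '@' and '.' with the spoken words 'at' and 'dot'."""
    spoken = token.replace("@", " at ").replace(".", " dot ")
    return " ".join(spoken.split())


def normalise(text: str) -> str:
    """Map each token to a letter-based form, for comparison only."""
    out = []
    for token in text.split():
        if token.isascii() and token.isdigit() and int(token) < 1000:
            out.append(number_to_words(int(token)))
        elif token in SYMBOLS:
            out.append(SYMBOLS[token])
        elif ADDRESS_RE.match(token):
            out.append(spell_address(token))
        else:
            out.append(token.lower())
    return " ".join(out)


print(normalise("100 Hz"))
print(normalise("write to pippo@pluto.com"))
```

The same function would be applied to both the reference and the hypothesis before scoring, so the spelled-out text never has to reach the final transcript.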
