Data engineering #4

Description

@justheuristic
  • Basic parallelism. Workload division. MP Queue. Native threading vs. Joblib. Threads vs. processes.
    Writing your own parallel image labeling system in a nutshell.
  • Link to Sasha Petrov's course?
  • Machine learning on large datasets. Stochastic gradient descent tweaks. Progressive validation. Parameter-server architectures. A hand-written out-of-core linear model.
  • Vowpal Wabbit. Song age prediction from edX or whatever.
  • Word count with Spark
  • CTR prediction with Spark
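
For the progressive validation and out-of-core linear model items, here is a minimal sketch of what the exercise could look like. It is only an illustration, not the planned course material: `stream_examples` and `sgd_progressive` are hypothetical names, the data is synthetic and noise-free, and the generator stands in for reading a file that does not fit in memory.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Example = Tuple[List[float], float]


def stream_examples(n: int, seed: int = 0) -> Iterator[Example]:
    """Yield synthetic (features, target) pairs one at a time.

    Stands in for reading rows from a dataset too large for memory.
    Targets follow y = 2*x0 - 3*x1 + 1 (an assumption made for the demo).
    """
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        yield x, 2.0 * x[0] - 3.0 * x[1] + 1.0


def sgd_progressive(examples: Iterable[Example], n_features: int, lr: float = 0.1):
    """Single-pass SGD on squared loss with progressive validation.

    Each example is scored before the model is updated on it, so the
    running mean loss is computed only on data the model has not seen yet.
    """
    w = [0.0] * n_features
    b = 0.0
    total_loss = 0.0
    seen = 0
    for x, y in examples:
        pred = b + sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        total_loss += err * err      # validate first ...
        seen += 1
        for i, xi in enumerate(x):   # ... then take one gradient step
            w[i] -= lr * err * xi
        b -= lr * err
    return w, b, total_loss / max(seen, 1)


# Only one example is held in memory at any time.
weights, bias, pv_loss = sgd_progressive(stream_examples(5000), n_features=2)
```

The progressive loss includes the large errors from the first few examples, so it is a pessimistic estimate of the final model's error; reporting it over a sliding window is a common alternative.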
