The Stream Data Analytics and Machine Learning Laboratory at the NSU Department of Mechanics and Mathematics was established in December 2015.

Laboratory Objectives:

  • Use modern deep learning methods to create technologies and technical solutions for video surveillance, speech recognition, social network monitoring, and person identification;
  • Develop new high-speed methods (100 Gbit/s and above) for intelligent processing of network traffic using GPUs and FPGAs;
  • Develop applied methods of big data analysis in the fields of natural language processing, speech recognition, and image recognition;
  • Create a quantum statistical theory of measurements for machine learning;
  • Develop the FRiS methodology for data analysis and its technical implementation.

Laboratory Projects:

  • Development of analytical tools for streaming data processing. Customer: Signatek LLC.
  • Master's Program “Big Data Analytics” (in English).
  • Continuing education course “Big Data Analytics for Business” in partnership with ExpaSoft LLC. In 2015-2017, 50 specialists were trained.
  • Hosting of the IEEE Siberian Symposium on Data Science and Engineering (SSDSE), an international conference held on April 14, 2016 under the auspices of IEEE.
  • The NTI Big Data Center at NSU.

Laboratory Achievements:

  • Video summarization;
  • Speaker identification;
  • Recognition of standard state documents (Pavlovsky E.N., Luppov D.A., Zyryanov A.O., Alyamkin S.A. Graphic and video information analysis module: classifier of document forms. // Certificate of state registration of computer program No. 2017612227, 02.17.2017);
  • Coreference resolution;
  • News bulletin classifier (Pavlovsky E.N., Maslovsky I.A., Batura T.V., Dyubanov V.V. Text information analysis module: classifier of texts. // Certificate of state registration of computer program No. 2017611829, 09.02.2017);
  • Collection of dossiers from open sources.

Laboratory Equipment:

  • NVIDIA DIGITS deep learning server (processor: 1 × Intel® Xeon® CPU E5-2699v3, 45 MB cache, 18C / 36T / 2.30 GHz; coprocessor: 4 × 12 GB NVIDIA GeForce TITAN X; RAM: 128 GB (16 × 8) 2133 MHz ECC REG);
  • 2 workstations for training neural networks (processor: 1 × CPU Intel Core i7 6700K (4C, 3.4 GHz, 2400 MHz, 8 MB, 140 W) or 1 × CPU Intel Xeon E5-1630 v3 (4C, 3.7 GHz, 3.8 GHz Turbo, 10 MB, 140 W); coprocessor: 12 GB NVIDIA GeForce TITAN X; RAM: 32 GB (2 × 16) 2133 MHz ECC REG);
  • 2 workstations for working with network traffic (processor: 1 × CPU Intel Xeon E5-1620 v4 (4C, 3.5 GHz, 3.8 GHz Turbo, 2400 MHz, 10 MB, 140 W); coprocessor: 8 GB NVIDIA Quadro M4000; RAM: 32 GB (4 × 8) 2133 MHz ECC REG; network adapter: 40 GbE Mellanox ConnectX-4 Lx EN);
  • server for data storage (processor: 1 × CPU Intel® Xeon® E5-2620v2, 15 MB cache, 6C / 12T / 2.10 GHz; storage: 2 × 2 TB SATA3 Caviar Green, 64 MB cache; RAM: 32 GB (16 × 2) 2133 MHz ECC REG);
  • server for processing network data (dual-processor system: 2 × CPU Intel® Xeon® E5-2698v4, 50 MB cache, 20C / 40T / 2.20 GHz; coprocessor: 12 GB NVIDIA Tesla K40; RAM: 128 GB (16 × 8) 2133 MHz ECC REG; 80 GbE network adapters: 2 × 40 GbE Mellanox InnovaFlex-4 Lx (with programmable FPGA); network accelerator: 40 GbE (4 × 10 GbE) Napatech NAC NT-40-E3-4-PTP).

Laboratory Partners:

  • Aigents
  • Sobolev Institute of Mathematics
  • A.P. Ershov Institute of Informatics Systems
  • Algorithmics Laboratory
  • Applied Probability Laboratory
