Analyttica’s Strengths
People
Analyttica’s team of AI & NLP SMEs is well positioned to drive impact for clients by contextualizing open-source Gen AI applications on business data through a white-box approach
Product
A deep technology platform and a powerful AI engine facilitate faster build times, ingestion of best-in-class open-source and custom-built algorithms, and ease of solution deployment, with transparent collaboration among teams.
Analyttica’s Generative AI Capabilities
Data Engineering Process - Capabilities
Process Stage | Purpose | Tools / Technologies | Alternate Tools / Technologies |
---|---|---|---|
Data Ingestion | Stream processing, batch processing | Connectors | Apache Kafka, Flume, Sqoop |
Data Storage/Warehousing | Data (schema/no-schema) aggregation and storage | AWS S3, Snowflake | Google Cloud Storage; relational databases (PostgreSQL, MySQL, MS SQL, etc.); data warehousing solutions (Amazon Redshift, Google BigQuery) |
Data Transformation | Data transformation as per the problem | DBT, Python (Jupyter) | Apache Beam, AWS Glue, Databricks |
Data Visualisation & Reporting | Interactive dashboards and visualization | Tableau, Looker, Power BI | Google Data Studio; Mapbox, Carto |
Data Sharing | Reporting | Confluence | Word, PDF |
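To make the transformation stage above concrete, here is a minimal, hypothetical sketch of a batch transformation step of the kind DBT or a Python (Jupyter) job would perform: raw CSV rows are aggregated per customer and emitted as JSON records ready for warehouse loading. The schema (`customer_id`, `amount`) is illustrative, not taken from any client pipeline.

```python
import csv
import io
import json
from collections import defaultdict

def transform_batch(raw_csv: str) -> str:
    """Aggregate raw transaction rows by customer_id (hypothetical schema)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(raw_csv)):
        totals[row["customer_id"]] += float(row["amount"])
    # Emit one JSON record per customer, ready for warehouse loading.
    return json.dumps(
        [{"customer_id": k, "total_amount": v} for k, v in sorted(totals.items())]
    )

raw = "customer_id,amount\nC1,10.5\nC2,3.0\nC1,4.5\n"
print(transform_batch(raw))  # C1 totals 15.0, C2 totals 3.0
```

In a production pipeline the same logic would typically run as a DBT model or a Spark/Glue job rather than plain Python.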
ML Ops Process - Capabilities
Process Stage | Purpose | Tools | Alternate Tools |
---|---|---|---|
Data Analysis and Preprocessing | Data cleaning, exploratory data analysis (EDA) | DBT | Databricks, Google Data Studio |
Experimentation and Modelling | Model training, hyperparameter tuning | MLflow (for experiment tracking); TensorFlow, Keras, scikit-learn, PySpark, PyTorch, spaCy, OpenAI, LangChain, Cohere, Pinecone | TensorBoard, Neptune.ai, WandB |
Feature Store | Feature versioning, storage of feature weights from models | Feast | Tecton, Hopsworks |
Code Repository | Version control, code review | Bitbucket | Git, GitHub/GitLab |
ML Pipeline | Automated model deployment | Kubeflow Pipelines | Apache Airflow, Jenkins, CircleCI |
Metadata Store | Store experiment metadata, track model versions | Kubeflow Metadata | MLflow, Weights & Biases Artifacts |
Model Registry | Model versioning, model lineage tracking | MLflow Model Registry | DVC |
Model Serving | Model serving, API endpoint creation | Kubeflow KFServing | TensorFlow Serving, Seldon, NVIDIA Triton |
Model Monitoring | Model performance tracking, data drift detection | Evidently | Grafana, Prometheus |
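The experimentation stage above revolves around logging parameters and metrics per run and comparing runs. The sketch below illustrates that pattern with a minimal stdlib stand-in; a real pipeline would use MLflow or a similar tracker, and the `ExperimentTracker` class and its run schema are hypothetical.

```python
import time

class ExperimentTracker:
    """Minimal stand-in for the experiment-tracking pattern tools like MLflow provide."""

    def __init__(self):
        self.runs = []

    def log_run(self, params: dict, metrics: dict) -> dict:
        # Record one training run with its hyperparameters and resulting metrics.
        run = {"run_id": len(self.runs) + 1, "params": params,
               "metrics": metrics, "ts": time.time()}
        self.runs.append(run)
        return run

    def best_run(self, metric: str) -> dict:
        # Pick the run with the highest value of the given metric.
        return max(self.runs, key=lambda r: r["metrics"][metric])

tracker = ExperimentTracker()
tracker.log_run({"lr": 0.1}, {"accuracy": 0.72})
tracker.log_run({"lr": 0.01}, {"accuracy": 0.75})
print(tracker.best_run("accuracy")["params"])  # {'lr': 0.01}
```

The value of the pattern is that hyperparameter tuning becomes a query over logged runs instead of ad hoc notes.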
Data Engineering & ML Ops Process - Programming Capabilities
Below is a list of programming languages along with their usage in Data Engineering & MLOps:
Programming Language | Usage in Data Engineering & MLOps |
---|---|
Python | Python is widely used thanks to its extensive libraries such as TensorFlow, PyTorch, scikit-learn, and Pandas. Every stage, from data processing and modelling to serving and monitoring, can involve Python. |
Java | Java underpins big data technologies (such as Apache Kafka and Apache Hadoop) and is used for model deployment, especially when integrating with enterprise systems. |
Scala | Scala runs on the JVM (Java Virtual Machine), and Scala-based frameworks such as Apache Spark are commonly used in data engineering tasks within MLOps. |
Go (Golang) | Go is valued for efficiency and scalability and is widely used in Kubernetes, which is essential for orchestrating containers in MLOps for model deployment and scaling. |
JavaScript | JavaScript and its frameworks (such as React and Angular) are used when serving models via web applications or developing monitoring dashboards. TensorFlow.js allows machine learning directly in the browser or on Node.js. |
SQL | SQL is widely used for tasks like feature extraction, data aggregation, and data validation, and is essential when interfacing with relational databases. |
Shell Scripting | Shell scripts automate deployment pipelines, data fetching, and other operational tasks. |
YAML | YAML is crucial in MLOps for defining configurations, especially in tools like Kubernetes and CI/CD pipelines. |
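As an illustration of the SQL usage described above (feature extraction and aggregation against a relational database), here is a small self-contained sketch using Python's built-in `sqlite3` module. The `tickets` table and its columns are hypothetical, standing in for a real relational feature source.

```python
import sqlite3

# In-memory database standing in for a relational feature source (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, category TEXT, resolution_hours REAL)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(1, "billing", 4.0), (2, "billing", 6.0), (3, "login", 1.0)],
)

# Feature extraction: per-category ticket counts and mean resolution time.
rows = conn.execute(
    "SELECT category, COUNT(*), AVG(resolution_hours) "
    "FROM tickets GROUP BY category ORDER BY category"
).fetchall()
print(rows)  # [('billing', 2, 5.0), ('login', 1, 1.0)]
```

The same `GROUP BY` aggregation pattern carries over unchanged to PostgreSQL, Redshift, or BigQuery.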
Accuracy Enhancements in Resolution Step Identification Solution
- Unique Resolution Steps (identified from text data) are the resolutions recommended for customer queries
- Developed an AI/ML solution to predict the resolution step to be recommended to customers by support reps
- Improved solution accuracy compared to the actual recommendations given by reps
- Implemented the latest Gen AI algorithms (from the Massive Text Embedding Benchmark (MTEB) Leaderboard)
- White-box approach to drive incremental results
- Leveraged Analyttica’s LEAPS platform for solution development & deployment
- Built solutions using state-of-the-art LLM algorithms such as Sentence Transformer and Instructor Large models
- The solution predicts a URI from a pool of 9,000 URIs with ~60% accuracy
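The retrieval core of a solution like this can be sketched as nearest-neighbour search over sentence embeddings. In the sketch below the embedding vectors are hypothetical hard-coded values; in the actual solution a sentence-embedding model (e.g. a Sentence Transformer) would produce them for each resolution step and incoming query.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical precomputed embeddings for two resolution steps.
resolution_steps = {
    "reset password": [0.9, 0.1, 0.0],
    "escalate to billing": [0.1, 0.9, 0.2],
}

def recommend(query_vec):
    # Nearest resolution step by cosine similarity to the query embedding.
    return max(resolution_steps, key=lambda k: cosine(query_vec, resolution_steps[k]))

print(recommend([0.8, 0.2, 0.1]))  # reset password
```

With 9,000 candidate steps, the same idea would be backed by a vector index (e.g. Pinecone, mentioned in the tooling table) rather than a linear scan.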
Accuracy Enhancement in Support Ticket Classification Solution
- Predict the category of tickets generated (text data based)
- Improve Ticket Category solution accuracy
- Continuous experimentation to build a layered solution with multiple Machine Learning and Semantic/Keyword layers
- Implemented the latest Gen AI algorithms (from the Massive Text Embedding Benchmark (MTEB) Leaderboard)
- White-box approach to drive incremental results
- Leveraged Analyttica’s LEAPS platform for solution development & deployment
- The solution covers 98% of ticket volume through prediction of 34 Ticket Categories
- Solution Accuracy is 75%
- Uses state-of-the-art LLM algorithms for encoding and decoding
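The "layered solution" idea above, an ML layer backed by a semantic/keyword fallback, can be sketched as follows. The `ml_layer` function and the keyword rules here are hypothetical placeholders, not the production model: the point is the control flow, where low-confidence ML predictions fall through to a keyword layer.

```python
def ml_layer(text: str):
    """Stand-in for the ML/LLM classifier; returns (label, confidence). Hypothetical."""
    if "invoice" in text:
        return "Billing", 0.9
    return "General", 0.4

# Hypothetical keyword rules forming the semantic/keyword fallback layer.
KEYWORD_RULES = {"password": "Account Access", "refund": "Billing"}

def classify(text: str, threshold: float = 0.6) -> str:
    label, conf = ml_layer(text)
    if conf >= threshold:
        return label            # ML layer is confident enough.
    for kw, cat in KEYWORD_RULES.items():
        if kw in text:
            return cat          # Semantic/keyword fallback layer.
    return label                # Default to the ML layer's best guess.

print(classify("invoice is wrong"))       # Billing (ML layer)
print(classify("cannot reset password"))  # Account Access (keyword layer)
```

Layering this way lets each layer be inspected and tuned independently, which is what makes the approach white-box.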
Empowering Business Growth Through Advanced Analytics & AI Solutions
Impact Case: ML Application & JSON Data Parsing
Business Objective
To drive better customer experience through faster resolution cycles by predicting the category of user-raised queries, enabling query assignment to the relevant resolution team.
Data Engineering Objective
To support and maintain continuity of the client-side data pipeline by integrating an ML-based custom solution using JSON-based APIs that support both real-time and batch-mode requests.
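A JSON API that serves both real-time and batch requests can be sketched as a single handler that accepts either a JSON object (one query) or a JSON array (a batch). Everything here is illustrative: `predict_category` stands in for the deployed model, and the request/response schema is an assumption, not the client's actual API contract.

```python
import json

def predict_category(record: dict) -> str:
    """Stand-in for the deployed ML classifier (hypothetical logic)."""
    return "Billing" if "refund" in record.get("text", "") else "General"

def handle_request(payload: str) -> str:
    """Accept a JSON object (real-time) or JSON array (batch); return predictions."""
    data = json.loads(payload)
    is_batch = isinstance(data, list)
    records = data if is_batch else [data]
    results = [{"id": r.get("id"), "category": predict_category(r)} for r in records]
    # Real-time callers get a single object back; batch callers get an array.
    return json.dumps(results if is_batch else results[0])

print(handle_request('{"id": 1, "text": "refund please"}'))
print(handle_request('[{"id": 1, "text": "refund"}, {"id": 2, "text": "hello"}]'))
```

In deployment this handler would sit behind a web framework (e.g. FastAPI or Flask) and be invoked by the client-side pipeline in both modes.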
Impact Case: Data Engineering Overview for a Growth Phase Fintech
Business Objective
To drive overall growth by expanding the product mix, acquiring new customers, and improving cross-sells to existing customers
Data Engineering Objective
To support the growth objective by continuously evolving the data infrastructure
To enhance analytics capabilities and the speed and accuracy of data-driven decisioning
Management Team & Key Talent Profiles
Ankit Mahna
AVP - Analytics & AI
Exp: 10+ Years
Education: MBA - IIM
Vishwanath Paramashetti
AVP - Data Science
Exp: 10+ Years
Education: BE
Varadharajan Sridharan
Principal Data Scientist
Exp: 6+ Years
Education: PG in Data Science, IIT
Arkadeep Banerjee
Principal Analyst
Exp: 7+ Years
Education: BTech