Data Science

Expertise in:

  • Inferential and Descriptive Statistics
  • Regression: linear, multiple, subset selection methods, shrinkage methods (ridge, lasso), tree-based methods, etc.
  • Classification: random forests & tree-based methods, Bayes classifier, KNN, logistic regression, LDA, QDA, etc.
  • Artificial Neural Networks using R and Python
  • Resampling methods (CV, LOOCV, Bootstrapping, etc.)
  • Unsupervised techniques: clustering methods (K-Means, hierarchical, etc.)
  • Dimensionality reduction methods (PCA, etc.)
  • Social network analysis: influence maximization methods (Independent Cascade and Linear Threshold models, simulation-based algorithms such as CELF, etc.)
  • Graph algorithms (BFS, k-way partitioning, Shortest Path, Spectral Bisection, etc.)
  • Centrality algorithms (Degree, Closeness, Betweenness, PageRank, etc.)
  • Community detection algorithms (Clustering Coefficient, Connected Components, etc.)
  • Time series analysis: AR, MA, ARIMA, SARIMA, anomaly detection
  • Natural language processing: LSTM, Naive Bayes, Vectorization techniques, etc.
  • Nonlinear methods (splines, etc.), Generalized Additive Models
  • Deep learning, Reinforcement learning
  • Other skills: data visualization (Tableau & Power BI), Databricks, Spark, Scala