The publications are grouped by topic.
Sketch-based Approximate Query Processing
- Florin Rusu and Alin Dobra. "Sketching Sampled Data Streams", Proceedings of IEEE ICDE 2009, Shanghai, China, April 2009, pp. 381-392. [Talk slides (pdf)].
- Florin Rusu and Alin Dobra. "Sketches for Size of Join Estimation", ACM Transactions on Database Systems, Vol. 33, No. 3, September 2008.
- Florin Rusu and Alin Dobra. "Pseudo-Random Number Generation for Sketch-Based Estimations", ACM Transactions on Database Systems, Vol. 32, No. 2, June 2007.
- Florin Rusu and Alin Dobra. "Statistical Analysis of Sketch Estimators", Proceedings of ACM SIGMOD 2007, Beijing, China, June 2007, pp. 187-198. [Talk slides (pdf)].
- Florin Rusu and Alin Dobra. "Fast Range-Summable Random Variables for Efficient Aggregate Estimation", Proceedings of ACM SIGMOD 2006, Chicago, Illinois, June 2006, pp. 193-204. [Talk slides (pdf)].
Histograms
- Lixia Chen, Alin Dobra: Histograms as statistical estimators for aggregate queries, Information Systems 38(2):213-230, 2013
- Alin Dobra: Histograms revisited: when are histograms the best approximation method for aggregates over joins?, PODS 2005: 228-237
Approximation of Aggregates in Sensor Networks
- Laukik Chitnis, Alin Dobra, Sanjay Ranka: Fault tolerant aggregation in heterogeneous sensor networks, Journal of Parallel and Distributed Computing 69(2):210-219, February 2009
- Laukik Chitnis, Alin Dobra, Sanjay Ranka: Aggregation methods for large-scale sensor networks. IEEE Transactions of Sensor Networks 4(2), 2008
- Laukik Chitnis, Alin Dobra, Sanjay Ranka. Analyzing the multiple aggregation trees technique for fault tolerance in sensor networks. In Proceedings of the International Conference on Information Systems, Technology and Management (ICISTM 2007), New Delhi, India, March 2007, pp. 269-279.
Analysis of Classification Methods based on Moment Analysis
- Amit Dhurandhar and Alin Dobra. Probabilistic Characterization of Nearest Neighbor Classifiers. International Journal of Machine Learning and Cybernetics (IJMLC), 2012. PDF
- Amit Dhurandhar and Alin Dobra. Distribution free bounds for Relational Classification. Knowledge and Information Systems 31(1):55-78 (2012). PDF
- Amit Dhurandhar and Alin Dobra. Semi-analytical Method for Analyzing Models and Model Selection Measures based on Moment Analysis. ACM Transactions on Knowledge Discovery from Data (TKDD), Vol. 3, 2009. PDF
- Amit Dhurandhar and Alin Dobra. Probabilistic Characterization of Random Decision Trees. Journal of Machine Learning Research (JMLR), Vol. 9, 2008. PDF
- Amit Dhurandhar and Alin Dobra. Test Set Bounds for Relational Data that vary with Strength of Dependence. Submitted. PDF
- Amit Dhurandhar and Alin Dobra. Insights into Cross-validation. Technical Report. PDF
- Amit Dhurandhar and Alin Dobra. Independent vs Collective Classification in Statistical Relational Learning. Submitted. PDF
- Amit Dhurandhar and Alin Dobra. Evaluating Evaluation Measures. Evaluation Methods in Machine Learning workshop at the International Conference on Machine Learning (ICML), 2009. PDF
- Amit Dhurandhar and Alin Dobra. Study of Classification Algorithms using Moment Analysis. One of 2 regular papers accepted to the New Challenges in Theoretical Machine Learning workshop at Neural Information Processing Systems (NIPS), 2008. PDF Talk link
Ph.D. and Master's Theses
- Lixia Chen: Statistical approximations of database queries with confidence intervals. 2011
- Florin Rusu: Sketches for aggregate estimations over data streams. 2009
- Amit Dhurandhar: Semi-analytical method for analyzing models and model selection measures. 2009
- Laukik Chitnis: Fault tolerance and scalability of data aggregation in sensor networks. 2008
- Guruditta Golani: Theory of linear operators for aggregate stream query processing. 2005