Publications

The bibliography (.bib) of my publications, annotated with keywords, is available elsewhere.

Quick links: Preprints, Publications (peer-reviewed), Presentations

Preprints (work-in-progress)

PrivLogit: Efficient Privacy-preserving Logistic Regression by Tailoring Numerical Optimizers
W Xie, Y Wang, SM Boker, DE Brown
arXiv preprint, 2016 [PDF][Preprint]

*TL;DR* We pointed out a surprising performance lag in privacy-preserving logistic regression methods and proposed tailoring numerical optimizers to better suit cryptography. Our new protocol is several times faster than state-of-the-art cryptographic protocols.

Publications (peer-reviewed)

A machine learning-based framework to identify type 2 diabetes through electronic health records
T Zheng, W Xie, L Xu, X He, Y Zhang, M You, G Yang, Y Chen
International Journal of Medical Informatics (IJMI), 2017 [PDF][Supplement][Preprint]

*TL;DR* Manual phenotyping of patient cohorts from electronic health records (EHR) using expert rules still dominates genome- and phenome-wide association studies (GWAS, PheWAS). Our work instead proposes a machine learning-based phenotyping approach, comprehensively evaluates top classifiers against expert rules, and shows that the classifiers yield superior performance. We validated our methods and findings using hundreds of ground-truth labels reviewed by expert clinicians from a large EHR network in China.

Supporting Regularized Logistic Regression Privately and Efficiently
W Li, H Liu, P Yang, W Xie* (*corresponding author)
PLOS ONE, 2016 [PDF][Preprint]

*TL;DR* We presented an efficient and secure method for privacy-preserving regularized logistic regression, building on secret sharing and the distributed Newton method. Previous work protected model inference insufficiently or not at all, and did not report computational performance evaluations using a real cryptographic implementation. Our work fills this gap.

SecureMA: protecting participant privacy in genetic association meta-analysis
W Xie, M Kantarcioglu, WS Bush, D Crawford, JC Denny, R Heatherly, BA Malin
Bioinformatics, 2014 [PDF][Supplementary][SecureMA code][CircuitService]

*TL;DR* We presented a privacy-preserving method and framework for performing genome-wide association studies (GWAS) in multi-center consortia, following the widely used meta-analysis formulation. This approach conforms to common practice in human genetics research and seems more natural, and more computationally efficient and scalable, than privacy-preserving distributed regression alternatives.

Inferring Clinical Workflow Efficiency via Electronic Medical Record Utilization
Y Chen, W Xie, CA Gunter, D Liebovitz, S Mehrotra, H Zhang, B Malin
American Medical Informatics Association (AMIA) Annual Symposium Proceedings, 2015 [PDF][Slides]

*TL;DR* We proposed a method based on topic modeling (latent Dirichlet allocation, or LDA) to model clinical workflows and care teams. This framework highlights potential opportunities for workflow optimization and better resource allocation.

A novel transfer learning method based on common space mapping and weighted domain matching
RZ Liang, W Xie, W Li, H Wang, JJY Wang, L Taylor
Proceedings of the IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), 2016 [PDF][Preprint]

*TL;DR* We proposed a new transfer learning method that is useful when labeled data is limited.

Presentations

PrivLogit: Fast Privacy-preserving Logistic Regression via Tailored Numerical Optimizer

Invited talk, International Conference on Design of Experiments (ICODOE), Memphis, TN, 2016

Privacy Leaks in Meta-analysis Quality Control and Countermeasures

Contributed, American Society of Human Genetics (ASHG) 65th Annual Meeting, Baltimore, MD, 2016