Xunpeng Huang | publications

2024

NeurIPS
Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference

Huang, Xunpeng, Zou, Difan, Dong, Hanze, Zhang, Yi, Ma, Yian, and Zhang, Tong

In Annual Conference on Neural Information Processing Systems (NeurIPS) 2024, Best paper of ICML Workshop on Structured Probabilistic Inference & Generative Modeling 2024

Abs Bib HTML

To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs. In this paper, we view such approaches as decomposing the entire denoising diffusion process into several segments, each corresponding to a reverse transition kernel (RTK) sampling subproblem. Specifically, DDPM uses a Gaussian approximation for the RTK, resulting in low per-subproblem complexity but requiring a large number of segments (i.e., subproblems), which is conjectured to be inefficient. To address this, we develop a general RTK framework that enables a more balanced subproblem decomposition, resulting in \tildeO(1) subproblems, each with strongly log-concave targets. We then propose leveraging two fast sampling algorithms, the Metropolis-Adjusted Langevin Algorithm (MALA) and Underdamped Langevin Dynamics (ULD), for solving these strongly log-concave subproblems. This gives rise to the RTK-MALA and RTK-ULD algorithms for diffusion inference. In theory, we further develop the convergence guarantees for RTK-MALA and RTK-ULD in total variation (TV) distance: RTK-ULD can achieve εtarget error within \tildeO(d^1/2ε^-1) under mild conditions, and RTK-MALA enjoys a O(d^2\log(d/ε)) convergence rate under slightly stricter conditions. These theoretical results surpass the state-of-the-art convergence rates for diffusion inference and are well supported by numerical experiments.
@inproceedings{huang2024reverse, abbr = {NeurIPS}, bibtex_show = {true}, title = {Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference}, author = {Huang, Xunpeng and Zou, Difan and Dong, Hanze and Zhang, Yi and Ma, Yian and Zhang, Tong}, booktitle = {Annual Conference on Neural Information Processing Systems (NeurIPS) 2024, Best paper of ICML Workshop on Structured Probabilistic Inference & Generative Modeling}, year = {2024}, html = {https://arxiv.org/abs/2401.06325}, selected = {true} }
COLT
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo

Huang, Xunpeng, Zou, Difan, Dong, Hanze, Ma, Yian, and Zhang, Tong

In Conference on Learning Theory (COLT) 2024

Abs Bib HTML

To sample from a general target distribution p∗∝e−f∗ beyond the isoperimetric condition, Huang et al. (2023) proposed to perform sampling through reverse diffusion, giving rise to Diffusion-based Monte Carlo (DMC). Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimation. However, the original DMC algorithm encountered high gradient complexity, resulting in an exponential dependency on the error tolerance ϵ of the obtained samples. In this paper, we demonstrate that the high complexity of DMC originates from its redundant design of score estimation, and proposed a more efficient algorithm, called RS-DMC, based on a novel recursive score estimation method. In particular, we first divide the entire diffusion process into multiple segments and then formulate the score estimation step (at any time step) as a series of interconnected mean estimation and sampling subproblems accordingly, which are correlated in a recursive manner. Importantly, we show that with a proper design of the segment decomposition, all sampling subproblems will only need to tackle a strongly log-concave distribution, which can be very efficient to solve using the Langevin-based samplers with a provably rapid convergence rate. As a result, we prove that the gradient complexity of RS-DMC only has a quasi-polynomial dependency on ϵ, which significantly improves exponential gradient complexity in Huang et al. (2023). Furthermore, under commonly used dissipative conditions, our algorithm is provably much faster than the popular Langevin-based algorithms. Our algorithm design and theoretical framework illuminate a novel direction for addressing sampling problems, which could be of broader applicability in the community.
@inproceedings{huang2024faster, abbr = {COLT}, bibtex_show = {true}, title = {Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo}, author = {Huang, Xunpeng and Zou, Difan and Dong, Hanze and Ma, Yian and Zhang, Tong}, booktitle = {Conference on Learning Theory (COLT)}, year = {2024}, html = {https://arxiv.org/abs/2401.06325}, selected = {true} }
ICML
Faster Sampling via Stochastic Gradient Proximal Sampler

Huang, Xunpeng, Zou, Difan, Ma, Yian, Dong, Hanze, and Zhang, Tong

In International Conference on Machine Learning (ICML) 2024

Abs Bib

Stochastic gradients have been widely integrated into Langevin-based methods to improve their scalability and efficiency in solving large-scale sampling problems. However, the proximal sampler, which exhibits much faster convergence than Langevin-based algorithms in the deterministic setting \citelee2021structured, has yet to be explored in its stochastic variants. In this paper, we study the Stochastic Proximal Samplers (SPS) for sampling from non-log-concave distributions. We first establish a general framework for implementing stochastic proximal samplers and establish the convergence theory accordingly. We show that the convergence to the target distribution can be guaranteed as long as the second moment of the algorithm trajectory is bounded and restricted Gaussian oracles can be well approximated. We then provide two implementable variants based on Stochastic gradient Langevin dynamics (SGLD) and Metropolis-adjusted Langevin algorithm (MALA), giving rise to SPS-SGLD and SPS-MALA. We further show that SPS-SGLD and SPS-MALA can achieve ε-sampling error in total variation (TV) distance within \tilde\mathcalO(dε^-2) and \tilde\mathcalO(d^1/2ε^-2) gradient complexities, which outperform the best-known result by at least an \tilde\mathcalO(d^1/3) factor. This enhancement in performance is corroborated by our empirical studies on synthetic data with various dimensions, demonstrating the efficiency of our proposed algorithm.
@inproceedings{huang2024sps, abbr = {ICML}, bibtex_show = {true}, title = {Faster Sampling via Stochastic Gradient Proximal Sampler}, author = {Huang, Xunpeng and Zou, Difan and Ma, Yian and Dong, Hanze and Zhang, Tong}, booktitle = {International Conference on Machine Learning (ICML)}, year = {2024}, selected = {true} }
ICLR
Reverse Diffusion Monte Carlo

Huang, Xunpeng, Dong, Hanze, HAO, Yifan, Ma, Yian, and Zhang, Tong

In International Conference on Learning Representations (ICLR) 2024

Abs Bib HTML

The efficacy of modern generative models is commonly contingent upon the precision of score estimation along the diffusion path, with a focus on diffusion models and their ability to generate high-quality data samples. This study delves into the application of reverse diffusion to Monte Carlo sampling. It is shown that score estimation can be transformed into a mean estimation problem via the decomposition of the transition kernel. By estimating the mean of the posterior distribution, we derive a novel Monte Carlo sampling algorithm from the reverse diffusion process, which is distinct from traditional Markov Chain Monte Carlo (MCMC) methods. We calculate the error requirements and sample size for the posterior distribution, and use the result to derive an algorithm that can approximate the target distribution to any desired accuracy. Additionally, by estimating the log-Sobolev constant of the posterior distribution, we show under suitable conditions the problem of sampling from the posterior can be easier than direct sampling from the target distribution using traditional MCMC techniques. For Gaussian mixture models, we demonstrate that the new algorithm achieves significant improvement over the traditional Langevin-style MCMC sampling methods both theoretically and practically. Our algorithm offers a new perspective and solution beyond classical MCMC algorithms for challenging complex distributions.
@inproceedings{huang2024reversf, abbr = {ICLR}, bibtex_show = {true}, title = {Reverse Diffusion Monte Carlo}, author = {Huang, Xunpeng and Dong, Hanze and HAO, Yifan and Ma, Yian and Zhang, Tong}, booktitle = {International Conference on Learning Representations (ICLR)}, year = {2024}, html = {https://openreview.net/forum?id=kIPEyMSdFV}, selected = {true} }

2021

AAAI
ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization

Huang, Xunpeng, Xu, Runxin, Zhou, Hao, Wang, Zhe, Liu, Zhengyang, and Li, Lei

In Proceedings of the AAAI Conference on Artificial Intelligence 2021

Abs Bib HTML

Stochastic gradient descent (SGD) is a widely used method for its outstanding generalization ability and simplicity. Adaptive gradient methods have been proposed to further accelerate the optimization process. In this paper, we revisit existing adaptive gradient optimization methods with a new interpretation. Such new perspective leads to a refreshed understanding of the roles of second moments in stochastic optimization. Based on this, we propose Angle-Calibration Moment method (ACMo), a novel stochastic optimization method. It enjoys the benefits of second moments with only first moment updates. Theoretical analysis shows that ACMo is able to achieve the same convergence rate as mainstream adaptive methods. Experiments on a variety of CV and NLP tasks demonstrate that ACMo has a comparable convergence to state-of-the-art Adam-type optimizers, and even a better generalization performance in most cases. The code is available at https://github.com/Xunpeng746/ACMo.
@inproceedings{huang2021acmo, abbr = {AAAI}, bibtex_show = {true}, title = {ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization}, author = {Huang, Xunpeng and Xu, Runxin and Zhou, Hao and Wang, Zhe and Liu, Zhengyang and Li, Lei}, booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence}, volume = {35}, number = {9}, pages = {7857--7864}, year = {2021}, html = {https://ojs.aaai.org/index.php/AAAI/article/view/16959} }

2020

AAAI
Span: A stochastic projected approximate newton method

Huang, Xunpeng, Liang, Xianfeng, Liu, Zhengyang, Li, Lei, Yu, Yue, and Li, Yitan

In Proceedings of the AAAI Conference on Artificial Intelligence 2020

Abs Bib HTML

Second-order optimization methods have desirable convergence properties. However, the exact Newton method requires expensive computation for the Hessian and its inverse. In this paper, we propose SPAN, a novel approximate and fast Newton method. SPAN computes the inverse of the Hessian matrix via low-rank approximation and stochastic Hessian-vector products. Our experiments on multiple benchmark datasets demonstrate that SPAN outperforms existing first-order and second-order optimization methods in terms of the convergence wall-clock time. Furthermore, we provide a theoretical analysis of the per-iteration complexity, the approximation error, and the convergence rate. Both the theoretical analysis and experimental results show that our proposed method achieves a better trade-off between the convergence rate and the per-iteration efficiency.
@inproceedings{huang2020span, abbr = {AAAI}, bibtex_show = {true}, title = {Span: A stochastic projected approximate newton method}, author = {Huang, Xunpeng and Liang, Xianfeng and Liu, Zhengyang and Li, Lei and Yu, Yue and Li, Yitan}, booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence}, volume = {34}, number = {02}, pages = {1520--1527}, year = {2020}, html = {https://ojs.aaai.org/index.php/AAAI/article/view/5511}, selected = {false} }

2019

2018

Inf. Sci
Finding potential lenders in P2P lending: A hybrid random walk approach

Zhang, Hefu, Zhao, Hongke, Liu, Qi, Xu, Tong, Chen, Enhong, and Huang, Xunpeng

Information Sciences 2018

Bib
@article{zhang2018finding, abbr = {Inf. Sci}, bibtex_show = {true}, title = {Finding potential lenders in P2P lending: A hybrid random walk approach}, author = {Zhang, Hefu and Zhao, Hongke and Liu, Qi and Xu, Tong and Chen, Enhong and Huang, Xunpeng}, journal = {Information Sciences}, volume = {432}, pages = {376--391}, year = {2018}, publisher = {Elsevier} }
AAAI
Confidence-aware matrix factorization for recommender systems

Wang, Chao, Liu, Qi, Wu, Runze, Chen, Enhong, Liu, Chuanren, Huang, Xunpeng, and Huang, Zhenya

In Proceedings of the AAAI Conference on Artificial Intelligence 2018

Bib
@inproceedings{wang2018confidence, abbr = {AAAI}, bibtex_show = {true}, title = {Confidence-aware matrix factorization for recommender systems}, author = {Wang, Chao and Liu, Qi and Wu, Runze and Chen, Enhong and Liu, Chuanren and Huang, Xunpeng and Huang, Zhenya}, booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence}, volume = {32}, number = {1}, year = {2018} }
DASFAA
Enhancing network embedding with auxiliary information: An explicit matrix factorization perspective

Guo, Junliang, Xu, Linli, Huang, Xunpeng, and Chen, Enhong

In International Conference on Database Systems for Advanced Applications 2018

Bib
@inproceedings{guo2018enhancing, abbr = {DASFAA}, bibtex_show = {true}, title = {Enhancing network embedding with auxiliary information: An explicit matrix factorization perspective}, author = {Guo, Junliang and Xu, Linli and Huang, Xunpeng and Chen, Enhong}, booktitle = {International Conference on Database Systems for Advanced Applications}, pages = {3--19}, year = {2018}, organization = {Springer} }

2017

IJCAI
Incremental Matrix Factorization: A Linear Feature Transformation Perspective.

Huang, Xunpeng, Wu, Le, Chen, Enhong, Zhu, Hengshu, Liu, Qi, and Wang, Yijun

In IJCAI 2017

Abs Bib HTML

Matrix Factorization (MF) is among the most widely used techniques for collaborative filtering based recommendation. Along this line, a critical demand is to incrementally refine the MF models when new ratings come in an online scenario. However, most of existing incremental MF algorithms are limited by specific MF models or strict use restrictions. In this paper, we propose a general incremental MF framework by designing a linear transformation of user and item latent vectors over time. This framework shows a relatively high accuracy with a computation and space efficient training process in an online scenario. Meanwhile, we explain the framework with a low-rank approximation perspective, and give an upper bound on the training error when this framework is used for incremental learning in some special cases. Finally, extensive experimental results on two real-world datasets clearly validate the effectiveness, efficiency and storage performance of the proposed framework.
@inproceedings{huang2017incremental, abbr = {IJCAI}, bibtex_show = {true}, title = {Incremental Matrix Factorization: A Linear Feature Transformation Perspective.}, author = {Huang, Xunpeng and Wu, Le and Chen, Enhong and Zhu, Hengshu and Liu, Qi and Wang, Yijun}, booktitle = {IJCAI}, pages = {1901--1908}, year = {2017}, html = {https://www.ijcai.org/proceedings/2017/0264.pdf} }