Ji Zhu

  1. On Consistency of Community Detection in Networks.

    Authors: Ji Zhu, Yunpeng Zhao, Elizaveta Levina
    Subjects: Statistics
    Abstract

    Community detection is a fundamental problem in network analysis, with
    applications in many diverse areas. The stochastic block model is a common tool
    for model-based community detection, and asymptotic tools for checking
    consistency of community detection under the block model have been recently
    developed. However, the block model is limited by its assumption that all nodes
    within a community are stochastically equivalent, and provides a poor fit to
    networks with hubs or highly varying node degrees within communities, which are
    common in practice.
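
    The stochastic-equivalence assumption mentioned above can be illustrated with a small sketch. The labels, the block probability matrix `P`, and the function names are hypothetical choices made for this example and do not come from the paper. Under a plain block model, the edge probability depends only on the two community labels, so every node in a community has the same expected degree.

```python
import random

def sample_sbm(labels, P, rng):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model.

    labels[i] is the community of node i; P[a][b] is the edge probability
    between a node in community a and a node in community b.
    """
    n = len(labels)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < P[labels[i]][labels[j]]:
                A[i][j] = A[j][i] = 1
    return A

def expected_degrees(labels, P):
    """Expected degree of each node: sum of edge probabilities to all others."""
    n = len(labels)
    return [
        sum(P[labels[i]][labels[j]] for j in range(n) if j != i)
        for i in range(n)
    ]

# Hypothetical example: two communities of five nodes each.
labels = [0] * 5 + [1] * 5
P = [[0.8, 0.1],
     [0.1, 0.6]]

A = sample_sbm(labels, P, random.Random(0))
exp_deg = expected_degrees(labels, P)
```

    In this sketch `exp_deg` takes a single value within each community, which is why a plain block model cannot represent hubs inside a community.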

  2. Random lasso.

    Authors: Ji Zhu, Bin Nan, Saharon Rosset, Sijian Wang
    Subjects: Applications
    Abstract

    We propose a computationally intensive method, the random lasso method, for
    variable selection in linear models. The method consists of two major steps. In
    step 1, the lasso method is applied to many bootstrap samples, each using a set
    of randomly selected covariates. This step yields a measure of importance
    for each covariate. In step 2, a procedure similar to the first step is
    implemented, except that for each bootstrap sample, a subset of
    covariates is randomly selected with unequal selection probabilities determined
    by the covariates' importance.
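
    The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how importance is computed or how the penalty is tuned, so the sketch assumes importance is the absolute value of the averaged coefficient, uses a fixed penalty `lam`, and fits each lasso with a simple coordinate descent. All function names, `q1`, `q2`, and the toy data are hypothetical.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/(2n))*||y - Xb||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def weighted_subset(weights, k, rng):
    """Draw k distinct indices with probability proportional to the weights."""
    pool = list(range(len(weights)))
    w = [weights[j] + 1e-12 for j in pool]  # small offset keeps every weight positive
    chosen = []
    for _ in range(k):
        idx = rng.choices(range(len(pool)), weights=w)[0]
        chosen.append(pool.pop(idx))
        w.pop(idx)
    return chosen

def bootstrap_pass(X, y, weights, q, n_boot, lam, rng):
    """Average lasso coefficients over bootstrap samples with q random covariates each."""
    n, p = len(X), len(X[0])
    total = [0.0] * p
    for _ in range(n_boot):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        cols = weighted_subset(weights, q, rng)      # random covariate subset
        Xb = [[X[i][j] for j in cols] for i in rows]
        yb = [y[i] for i in rows]
        for j, c in zip(cols, lasso_cd(Xb, yb, lam)):
            total[j] += c  # covariates not selected contribute zero
    return [t / n_boot for t in total]

def random_lasso(X, y, q1, q2, n_boot=50, lam=0.1, seed=0):
    rng = random.Random(seed)
    p = len(X[0])
    # Step 1: equal selection probabilities; assumed importance = |average coefficient|.
    step1 = bootstrap_pass(X, y, [1.0] * p, q1, n_boot, lam, rng)
    importance = [abs(b) for b in step1]
    # Step 2: selection probabilities proportional to importance.
    step2 = bootstrap_pass(X, y, importance, q2, n_boot, lam, rng)
    return importance, step2

# Hypothetical toy data: only the first two of eight covariates matter.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(60)]
y = [3 * row[0] - 2 * row[1] + data_rng.gauss(0, 0.5) for row in X]
importance, coef = random_lasso(X, y, q1=4, q2=4)
```

    On this toy data the two informative covariates should receive much larger importance than the noise covariates, so step 2 selects them far more often.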

  3. Quantifying Information Leakage in Finite Order Deterministic Programs.

    Authors: Ji Zhu, Mudhakar Srivatsa
    Subjects: Cryptography and Security
    Abstract

    Information flow analysis is a powerful technique for reasoning about the
    sensitive information exposed by a program during its execution.

  4. Group Variable Selection via a Hierarchical Lasso and Its Oracle Property.

    Authors: Ji Zhu, Nengfeng Zhou
    Subjects: Methodology
    Abstract

    In many engineering and scientific applications, prediction variables are
    grouped, for example, in biological applications where assayed genes or
    proteins can be grouped by biological roles or biological pathways. Common
    statistical analysis methods such as ANOVA, factor analysis, and functional
    modeling with basis sets also exhibit natural variable groupings.

  5. Community extraction for social networks.

    Authors: Ji Zhu, Yunpeng Zhao, Elizaveta Levina
    Subjects: Methodology
    Abstract

    Analysis of networks and in particular discovering communities within
    networks has been a focus of recent work in several fields, with applications
    ranging from citation and friendship networks to food webs and gene regulatory
    networks. Most of the existing community detection methods focus on
    partitioning the entire network into communities, with the expectation of many
    ties within communities and few ties between. However, many networks contain
    nodes that do not fit in with any of the communities, and forcing every node
    into a community can distort results.

  6. The Missing Piece Syndrome in Peer-to-Peer Communication.

    Authors: Ji Zhu, Bruce Hajek
    Subjects: Performance
    Abstract

    Typical protocols for peer-to-peer file sharing over the Internet divide
    files to be shared into pieces. New peers strive to obtain a complete
    collection of pieces from other peers and from a seed. In this paper we
    identify a problem that can occur if the seeding rate is not large enough. The
    problem is that, even if the statistics of the system are symmetric in the
    pieces, there can be symmetry breaking, with one piece becoming very rare. If
    peers depart after obtaining a complete collection, they tend to leave
    before helping other peers receive the rare piece.

  7. Functional linear regression that's interpretable.

    Authors: Gareth M. James, Jing Wang, Ji Zhu
    Subjects: gr. Statistics
    Abstract

    Regression models to relate a scalar $Y$ to a functional predictor $X(t)$ are
    becoming increasingly common. Work in this area has concentrated on estimating
    a coefficient function, $\beta(t)$, with $Y$ related to $X(t)$ through
    $\int\beta(t)X(t) dt$. Regions where $\beta(t)\ne0$ correspond to places where
    there is a relationship between $X(t)$ and $Y$. Conversely, points where
    $\beta(t)=0$ indicate no relationship.
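
    The role of the zero regions of $\beta(t)$ can be checked numerically. The sketch below approximates $\int\beta(t)X(t) dt$ on $[0,1]$ with the trapezoid rule; the particular coefficient function, the two predictor curves, and the grid are hypothetical choices for illustration only. Two curves that differ only where $\beta(t)=0$ give the same value of the integral.

```python
import math

def trapezoid(values, grid):
    """Trapezoid-rule approximation of an integral from sampled values."""
    return sum(
        0.5 * (values[k] + values[k + 1]) * (grid[k + 1] - grid[k])
        for k in range(len(grid) - 1)
    )

def beta(t):
    """Hypothetical coefficient function: nonzero only on [0.25, 0.5]."""
    return 2.0 if 0.25 <= t <= 0.5 else 0.0

def linear_predictor(x_func, grid):
    """Approximate the integral of beta(t) * X(t) over the grid."""
    return trapezoid([beta(t) * x_func(t) for t in grid], grid)

grid = [k / 1000 for k in range(1001)]

def x_a(t):
    return math.sin(2 * math.pi * t)

def x_b(t):
    # Same curve as x_a, shifted upward only where beta(t) = 0.
    return math.sin(2 * math.pi * t) + (5.0 if t > 0.75 else 0.0)

pred_a = linear_predictor(x_a, grid)
pred_b = linear_predictor(x_b, grid)
```

    Here `pred_a` and `pred_b` agree even though the curves differ on $(0.75, 1]$, because $\beta(t)=0$ there.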
