The combination of galaxy-galaxy lensing (GGL) with galaxy clustering is one of the most promising routes to determining the amplitude of matter clustering at low redshifts. We show that extending clustering+GGL analyses from the linear regime down to $\approx 0.5 h^{-1} \mathrm{Mpc}$ scales increases their constraining power considerably, even after marginalizing over a flexible model of non-linear galaxy bias. Using a grid of cosmological N-body simulations, we construct a Taylor-expansion emulator that predicts the galaxy autocorrelation $\xi_{gg}(r)$ and galaxy-matter cross-correlation $\xi_{gm}(r)$ as a function of $\sigma_8$, $\Omega_m$, and halo occupation distribution (HOD) parameters, which are allowed to vary with large-scale environment to represent possible effects of galaxy assembly bias. We present forecasts for a fiducial case that corresponds to BOSS LOWZ galaxy clustering and SDSS-depth weak lensing (effective source density $\sim 0.3 \mathrm{arcmin}^{-2}$). Using tangential shear $\gamma_t$ and projected correlation function $w_p$ measurements over $0.5 \leq r_p \leq 30 h^{-1} \mathrm{Mpc}$ yields a 2 per cent constraint on the parameter combination $\sigma_8 \Omega_m^{0.6}$, a factor of two better than a constraint that excludes non-linear scales (i.e. one restricted to $r_p > 2 h^{-1} \mathrm{Mpc}$ for $\gamma_t$ and $r_p > 4 h^{-1} \mathrm{Mpc}$ for $w_p$). Much of this improvement comes from the non-linear clustering information, which breaks degeneracies among HOD parameters. Increasing the effective source density to $3 \mathrm{arcmin}^{-2}$ sharpens the constraint on $\sigma_8 \Omega_m^{0.6}$ by a further factor of two. With robust modelling into the non-linear regime, low-redshift measurements of matter clustering at the 1 per cent level with clustering+GGL alone are well within reach of current data sets such as those provided by the Dark Energy Survey.
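
To illustrate the idea behind a Taylor-expansion emulator, the following is a minimal sketch that assumes a first-order expansion about a fiducial parameter point, with derivatives estimated by central finite differences. It is not the emulator used in this work: the function `toy_xi` is a hypothetical stand-in for a simulation measurement, the names `build_taylor_emulator` and `emulate` are illustrative, and the expansion order, parameter set, and step sizes are all assumptions.

```python
import numpy as np


def build_taylor_emulator(model, theta_fid, steps):
    """Return a first-order Taylor emulator of `model` around `theta_fid`.

    `model` maps a parameter vector to a data vector (here a stand-in for
    a correlation function measured from a simulation).
    """
    theta_fid = np.asarray(theta_fid, dtype=float)
    steps = np.asarray(steps, dtype=float)
    y_fid = np.asarray(model(theta_fid), dtype=float)

    # Central finite differences: one column of derivatives per parameter.
    grads = np.empty((y_fid.size, theta_fid.size))
    for i, h in enumerate(steps):
        dtheta = np.zeros_like(theta_fid)
        dtheta[i] = h
        upper = np.asarray(model(theta_fid + dtheta), dtype=float)
        lower = np.asarray(model(theta_fid - dtheta), dtype=float)
        grads[:, i] = (upper - lower) / (2.0 * h)

    def emulate(theta):
        # y(theta) ~ y(theta_fid) + sum_i dy/dtheta_i * (theta_i - theta_fid_i)
        delta = np.asarray(theta, dtype=float) - theta_fid
        return y_fid + grads @ delta

    return emulate


def toy_xi(theta, r=np.array([0.5, 1.0, 5.0, 30.0])):
    """Hypothetical toy model in (sigma_8, Omega_m); not a physical prediction."""
    sigma8, omega_m = theta
    return sigma8**2 * (r / 5.0) ** (-1.8) * (1.0 + 0.5 * (omega_m - 0.3))


# Assumed fiducial point and finite-difference step sizes.
theta_fid = np.array([0.8, 0.3])
emulate = build_taylor_emulator(toy_xi, theta_fid, steps=[0.01, 0.01])

# Compare the emulator with the toy model at a nearby parameter point.
theta_test = np.array([0.81, 0.31])
max_rel_err = np.max(np.abs(emulate(theta_test) / toy_xi(theta_test) - 1.0))
print(f"max relative error: {max_rel_err:.2e}")
```

In a sketch of this kind each derivative costs two model evaluations per parameter, which is what makes a modest grid of simulations sufficient; the accuracy degrades as the test point moves away from the fiducial one, so the expansion is only trustworthy over a limited parameter range.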