Let $y = X\beta + u$ be a regression model. If we assume $\mathbb V[u|X] = \sigma^2 I$, does this imply $\mathbb E[u|X] = c$? Clearly, $\mathbb V[u|X] = \mathbb E[uu'|X] - \mathbb E[u|X]\mathbb E[u|X]'$, so $\mathbb E[uu'|X] = \sigma^2 I + \mathbb E[u|X]\mathbb E[u|X]'$. Since both $\mathbb E[uu'|X]$ and $\mathbb E[u|X]\mathbb E[u|X]'$ can vary with $X$, $\mathbb E[u|X]$ need not be constant. But what would be a counterexample?
If we assume i.i.d. observations and only a single regressor, it suffices to consider a single $u_i$. So $\mathbb E[u_i^2|x_i] = \sigma^2 + \mathbb E[u_i|x_i]^2$. Taking the derivative with respect to $x_i$ gives $D\mathbb E[u_i^2|x_i] = 2\,\mathbb E[u_i|x_i]\,D\mathbb E[u_i|x_i]$, so I thought about assuming $\mathbb E[u_i|x_i] = 0.5x_i$ and hence $\mathbb E[u^2_i|x_i] = \sigma^2 + 0.25x_i^2$. Is this reasoning correct, or am I missing a point?
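The construction $u_i = 0.5x_i + \varepsilon_i$ with $\varepsilon_i \sim N(0, \sigma^2)$ independent of $x_i$ can be checked numerically. Here is a minimal sketch in Python/NumPy (the choices $\sigma^2 = 1$ and $x_i \sim U(0, 10)$ are mine, for illustration): the conditional variance stays constant across bins of $x$ while the conditional mean clearly varies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(0, 10, size=n)
u = 0.5 * x + rng.normal(0.0, 1.0, size=n)  # E[u|x] = 0.5x, V[u|x] = 1

# bin observations by x and inspect the conditional moments per bin
bins = np.digitize(x, np.linspace(0, 10, 11))
for b in (1, 5, 10):
    mask = bins == b
    print(f"bin {b}: mean = {u[mask].mean():.2f}, var = {u[mask].var():.2f}")
# the per-bin variance is close to 1 everywhere,
# while the per-bin mean grows with x
```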
$\mathbb E[u|X]$ can be whatever you want; the variance assumption alone does not imply it must be anything. Most commonly, $\mathbb E[u|X]$ is assumed to be $0$. If this were not done, and you instead assumed it equal to some parameter to be estimated, say $c$, then the intercept and $c$ could not be separately identified, and the parameters would be less interpretable.
If you assume $\mathbb E[u_i|x_i] = 0.5x_i$, then you can write your model as $$ y_i = \beta_0 + \beta_1 x_i + (0.5x_i + \sigma z_i), $$ where $z_i$ is a standard normal variate. The parameters $(\beta_0, \beta_1, 0.5)$ yield the same likelihood as $(\beta_0, \beta_1 + 0.5, 0)$. Since you are not estimating the $0.5$, technically your model is still identifiable; but I see no advantage in departing from the convention that the noise has mean zero.
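A quick simulation makes the point concrete (Python/NumPy sketch; the true values $\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 1$ are assumptions for illustration): OLS simply absorbs the $0.5$ into the slope, so the fit recovers $\beta_1 + 0.5$, not $\beta_1$.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
beta0, beta1 = 1.0, 2.0
x = rng.uniform(0, 10, size=n)
z = rng.normal(size=n)
y = beta0 + beta1 * x + (0.5 * x + z)  # noise with E[u|x] = 0.5x

# OLS via least squares: the slope estimate converges to beta1 + 0.5
X = np.column_stack([np.ones(n), x])
b0_hat, b1_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(b0_hat, b1_hat)  # roughly 1.0 and 2.5
```

This is exactly the likelihood equivalence above: the data cannot distinguish $(\beta_0, \beta_1, 0.5)$ from $(\beta_0, \beta_1 + 0.5, 0)$.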