nntoolbox.utils.gradient module
-
nntoolbox.utils.gradient.compute_gradient(output: torch.Tensor, model: torch.nn.modules.module.Module) → List[torch.Tensor]
Compute the gradient of a model's output with respect to the model's parameters.
- Parameters
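output – the output tensor to differentiate (e.g. a loss)
model – the model whose parameters the gradient is taken with respect to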
- Returns
list of gradients of model parameters
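As a rough illustration of what this entry describes, the sketch below collects per-parameter gradients with `torch.autograd.grad`. The helper name `gradient_sketch` is hypothetical and the body is an assumption about the behaviour, not the library's implementation.

```python
from typing import List

import torch
from torch import nn


def gradient_sketch(output: torch.Tensor, model: nn.Module) -> List[torch.Tensor]:
    """Hypothetical stand-in for compute_gradient; not the library's code."""
    params = [p for p in model.parameters() if p.requires_grad]
    # torch.autograd.grad returns one gradient tensor per parameter, in order.
    # retain_graph=True keeps the graph so the output can be differentiated again.
    grads = torch.autograd.grad(output, params, retain_graph=True)
    return list(grads)


torch.manual_seed(0)
model = nn.Linear(3, 1)
x = torch.randn(4, 3)
loss = model(x).pow(2).mean()  # a scalar output
grads = gradient_sketch(loss, model)  # [d loss / d weight, d loss / d bias]
```

Unlike `loss.backward()`, this approach returns the gradients directly and leaves each parameter's `.grad` attribute untouched.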
-
nntoolbox.utils.gradient.compute_gradient_norm(output: torch.Tensor, model: torch.nn.modules.module.Module)
Compute the norm of the gradient of an output (e.g. a loss) with respect to a model's parameters.
- Parameters
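output – the output tensor to differentiate (e.g. a loss)
model – the model whose parameters the gradient is taken with respect to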
- Returns
the norm of the gradient of output with respect to the model's parameters
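The entry does not say which norm is used. The sketch below assumes a global L2 norm over all parameter gradients, flattened and concatenated; `gradient_norm_sketch` is a hypothetical name, and the library may compute something different.

```python
import torch
from torch import nn


def gradient_norm_sketch(output: torch.Tensor, model: nn.Module) -> torch.Tensor:
    """Hypothetical stand-in for compute_gradient_norm (assumes a global L2 norm)."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(output, params, retain_graph=True)
    # Flatten every gradient and treat them as one long vector.
    flat = torch.cat([g.reshape(-1) for g in grads])
    return flat.norm(p=2)


model = nn.Linear(2, 1)
x = torch.tensor([3.0, 4.0])
out = model(x).sum()  # out = w . x + b, so d out / d w = x and d out / d b = 1
norm = gradient_norm_sketch(out, model)  # sqrt(3^2 + 4^2 + 1^2) = sqrt(26)
```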
-
nntoolbox.utils.gradient.compute_jacobian(input: torch.Tensor, fn: Callable[[torch.Tensor], torch.Tensor], is_batch: bool = True, requires_grad: bool = True) → torch.Tensor
Compute the Jacobian of fn(input) with respect to input. For most purposes, use compute_jacobian_v2 instead.
- Parameters
input – the input tensor; assumed to have requires_grad = True
fn – the function to differentiate
is_batch – whether to compute the gradient by batch
- Returns
the Jacobian of fn(input) with respect to input
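To make the quantity concrete, the example below computes the Jacobian of a function with respect to an unbatched input. It uses PyTorch's own `torch.autograd.functional.jacobian` as a reference point rather than `compute_jacobian`, whose batching behaviour is not described here; the matrix `A` and the function `fn` are illustrative assumptions.

```python
import torch
from torch.autograd.functional import jacobian

A = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])


def fn(x: torch.Tensor) -> torch.Tensor:
    # Linear map from R^3 to R^2, so the Jacobian is A itself.
    return A @ x


x = torch.tensor([0.5, -1.0, 2.0])
# Result shape is output.shape + input.shape, here (2, 3).
J = jacobian(fn, x)
```

Entry `J[i, j]` is the derivative of the i-th output with respect to the j-th input, which for this linear map equals `A[i, j]`.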
-
nntoolbox.utils.gradient.compute_jacobian_v2(output: torch.Tensor, input: Union[torch.Tensor, Iterable[torch.Tensor]], requires_grad: bool = True) → Union[torch.Tensor, Iterable[torch.Tensor]]
Compute the Jacobian of a vector with respect to an input tensor.
- Parameters
output – a 1D vector of length L
input – either a tensor (parameter) or an iterable of parameters
requires_grad – whether output should be differentiable
- Returns
jacobian
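One way to build such a Jacobian from an already computed 1D output is to differentiate each output element in turn. The sketch below covers only the single-tensor case and assumes `requires_grad` maps to `create_graph` in `torch.autograd.grad`; `jacobian_rows_sketch` is a hypothetical name, not the library's implementation.

```python
import torch


def jacobian_rows_sketch(output: torch.Tensor, input: torch.Tensor,
                         requires_grad: bool = True) -> torch.Tensor:
    """Hypothetical row-by-row Jacobian of a 1D output w.r.t. one tensor."""
    rows = []
    for i in range(output.shape[0]):
        # Row i is the gradient of the i-th output element w.r.t. input.
        (row,) = torch.autograd.grad(
            output[i], input,
            retain_graph=True,            # the graph is reused for the next row
            create_graph=requires_grad,   # assumption: keeps the result differentiable
        )
        rows.append(row)
    return torch.stack(rows)  # shape: (L, *input.shape)


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                       # 1D output of length L = 3
J = jacobian_rows_sketch(y, x)   # diagonal matrix with entries 2 * x
```

This costs one backward pass per output element, so it is practical mainly when L is small.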
-
nntoolbox.utils.gradient.gather_flat_grad(params: Iterable[torch.Tensor]) → torch.Tensor
Gather the gradients of all the parameters and flatten them into a single vector. Adapted from PyTorch's L-BFGS implementation.
- Parameters
params – List of parameters
- Returns
gradient vector of the parameters
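The sketch below follows the pattern of the gradient-gathering helper in PyTorch's L-BFGS optimizer: parameters without a gradient contribute zeros, sparse gradients are densified, and everything is concatenated. `gather_flat_grad_sketch` is a hypothetical name, and the details may differ from the library's version.

```python
from typing import Iterable

import torch
from torch import nn


def gather_flat_grad_sketch(params: Iterable[torch.Tensor]) -> torch.Tensor:
    """Hypothetical stand-in for gather_flat_grad."""
    views = []
    for p in params:
        if p.grad is None:
            # No gradient yet: fill with zeros so positions stay aligned.
            view = torch.zeros(p.numel(), dtype=p.dtype, device=p.device)
        elif p.grad.is_sparse:
            view = p.grad.to_dense().reshape(-1)
        else:
            view = p.grad.reshape(-1)
        views.append(view)
    return torch.cat(views, 0)


torch.manual_seed(0)
model = nn.Linear(3, 2)
extra = nn.Parameter(torch.ones(2))  # never used, so extra.grad stays None
model(torch.randn(5, 3)).sum().backward()
flat = gather_flat_grad_sketch([model.weight, model.bias, extra])  # 6 + 2 + 2 entries
```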
-
nntoolbox.utils.gradient.hessian_diagonal(output: torch.Tensor, input: Union[torch.Tensor, Iterable], requires_grad: bool = True) → Union[torch.Tensor, List[torch.Tensor]]
Compute the diagonal of the Hessian.
- Parameters
output – a scalar tensor
input – either a tensor (parameter), or a list/generator of parameters
requires_grad – whether output should be differentiable
- Returns
a tensor, or a list of tensors, holding the diagonal of the Hessian of output with respect to input
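A straightforward, if slow, way to obtain a Hessian diagonal is to differentiate each entry of the first-order gradient a second time and keep only the matching element. The sketch below handles a single tensor input and ignores the `requires_grad` option; `hessian_diagonal_sketch` is a hypothetical name and not necessarily how the library computes it.

```python
import torch


def hessian_diagonal_sketch(output: torch.Tensor, input: torch.Tensor) -> torch.Tensor:
    """Hypothetical Hessian diagonal of a scalar output w.r.t. one tensor."""
    # First-order gradient, kept differentiable for the second pass.
    (grad,) = torch.autograd.grad(output, input, create_graph=True)
    flat_grad = grad.reshape(-1)
    diag = []
    for i in range(flat_grad.numel()):
        # Assumes every gradient entry still depends on input.
        (second,) = torch.autograd.grad(flat_grad[i], input, retain_graph=True)
        diag.append(second.reshape(-1)[i])  # keep only d^2 output / d input_i^2
    return torch.stack(diag).reshape(input.shape)


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = (x ** 3).sum() + x[0] * x[1]  # the cross term does not affect the diagonal
diag = hessian_diagonal_sketch(out, x)  # equals 6 * x
```

This needs one extra backward pass per input element, so it suits small inputs or debugging rather than large models.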