Docstrings
Useful Structs
MPSTime.Encoding — Type

Encoding

Abstract supertype of all encodings. To specify an encoding for MPS training, set the `encoding` keyword when calling `MPSOptions`.
Example

```julia
julia> opts = MPSOptions(; encoding=:Legendre);

julia> W, info, test_states = fitMPS(X_train, y_train, X_test, y_test, opts);
```
Encodings
- `:Legendre`: The first d L2-normalised Legendre polynomials. Real valued, and supports passing `projected_basis=true` to `MPSOptions`.
- `:Fourier`: Complex-valued Fourier coefficients. Supports passing `projected_basis=true` to `MPSOptions`.
\[ \Phi(x; d) = \left[1 + 0i,\ e^{i \pi x},\ e^{-i \pi x},\ e^{2i \pi x},\ e^{-2i \pi x},\ \ldots \right] / \sqrt{d} \]
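As a concrete illustration, this feature map can be written out directly in Julia. This is a minimal sketch; `fourier_features` is a hypothetical helper name, not MPSTime's internal implementation:

```julia
# Sketch of the Fourier feature map Φ(x; d) shown above.
# `fourier_features` is a hypothetical helper, not part of the MPSTime API.
function fourier_features(x::Real, d::Int)
    phi = ones(ComplexF64, d)          # first entry is 1 + 0i
    for j in 2:d
        k = j ÷ 2                      # frequencies 1, 1, 2, 2, ...
        s = iseven(j) ? 1 : -1         # alternate e^{+ikπx} and e^{-ikπx}
        phi[j] = exp(s * im * k * π * x)
    end
    return phi ./ sqrt(d)              # normalise by √d
end
```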
- `:Stoudenmire`: The original complex-valued "spin-1/2" encoding from Stoudenmire & Schwab, 2017 (arXiv). Only supports d = 2.
\[ \Phi(x) = \left[ e^{3 i \pi x / 2} \cos(\frac{\pi}{2} x), e^{-3 i \pi x / 2} \sin(\frac{\pi}{2} x)\right]\]
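In Julia, this two-component map could be sketched as follows (hypothetical helper name, for illustration only):

```julia
# Sketch of the d = 2 spin-1/2 feature map above (hypothetical helper).
stoudenmire_features(x::Real) = [exp( 3im * π * x / 2) * cos(π * x / 2),
                                 exp(-3im * π * x / 2) * sin(π * x / 2)]
```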
- `:Sahand_Legendre_Time_Dependent` (`:SLTD`): A custom, real-valued encoding constructed as a data-driven adaptation of the Legendre polynomials. At each time point $t$, the training data is used to construct a probability density function that describes the distribution of the time-series amplitude $x_t$. This pdf is the first basis function: $b_1(x; t) = \text{pdf}_{x_t}(x)$. It can be computed with KernelDensity.jl:

```julia
julia> using KernelDensity

julia> xs_samps = range(-1, 1, max(200, size(X_train, 2)))

julia> b1(xs, t) = pdf(kde(X_train[t, :]), xs)

julia> b1(xs_samps, 1)  # evaluate the first basis function at time point 1
```
The second basis function is the first order polynomial that is L2-orthogonal to this pdf on the interval [-1,1].
\[ b_2(x;t) = a_1 x + a_0, \quad \text{where } \int_{-1}^1 b_1(x;t)\, b_2^*(x; t)\, \mathrm{d} x = 0, \quad \lVert b_2(x; t) \rVert_{L^2} = 1 \]
The third basis function is the second order polynomial that is L2-orthogonal to the first two basis functions on [-1,1], etc.
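This construction can be sketched numerically by Gram–Schmidt orthonormalising the monomials 1, x, x², … against the pdf on a quadrature grid. The following is an illustrative sketch with a hypothetical helper name; MPSTime's internal construction may differ:

```julia
using KernelDensity

# Build the first d SLTD-style basis functions at time point t by
# discretised Gram–Schmidt over [-1, 1]. Hypothetical helper, not MPSTime's API.
function sltd_basis(X_train::AbstractMatrix, t::Int, d::Int; npts::Int=200)
    xs = range(-1, 1, npts)
    w  = step(xs)                                 # quadrature weight
    b1 = pdf(kde(X_train[t, :]), xs)              # data-driven first basis function
    basis = [b1 / sqrt(w * sum(abs2, b1))]        # L2-normalise
    for order in 1:(d - 1)
        v = collect(xs) .^ order                  # next monomial
        for b in basis
            v .-= (w * sum(b .* v)) .* b          # project out earlier functions
        end
        push!(basis, v / sqrt(w * sum(abs2, v)))  # normalise and append
    end
    return xs, basis
end
```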
- `:Custom`: For use with user-defined custom bases. See `function_basis`.
MPSTime.EncodedTimeSeriesSet — Type

EncodedTimeSeriesSet

Holds an encoded time-series dataset, as well as a copy of the original data and its class distribution.
MPSTime.TrainedMPS — Type

TrainedMPS

Container for a trained MPS and its associated Options and training data.

Fields

- `mps::MPS`: A trained Matrix Product State.
- `opts::MPSOptions`: User-defined `MPSOptions` used to create the MPS.
- `train_data::EncodedTimeSeriesSet`: Stores both the raw and encoded data used to train the MPS.
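For example, using the fitMPS call from the earlier example, the returned container's fields can be accessed directly (illustrative session):

```julia
julia> W, info, test_states = fitMPS(X_train, y_train, X_test, y_test, opts);

julia> W.mps         # the trained Matrix Product State

julia> W.opts        # the MPSOptions used for training

julia> W.train_data  # the raw and encoded training data
```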
Hyperparameters
MPSTime.AbstractMPSOptions — Type

AbstractMPSOptions

Abstract supertype of the concrete types `MPSOptions`, which is used to specify options for training, and `Options`, which is used internally and contains references to internal objects.
MPSTime.MPSOptions — Type

MPSOptions(; <Keyword Arguments>)

Set the hyperparameters and other options for `fitMPS`.
Fields:
Logging
- `verbosity::Int=1`: How much debug/progress info to print to the terminal while optimising the MPS. Higher numbers mean more output.
- `log_level::Int=3`: How much statistical output. 0 for nothing; >0 to print losses, accuracies, and the confusion matrix at each step (noticeable computational overhead). #TODO implement finer grain control
- `track_cost::Bool=false`: Whether to print the cost at each bond tensor site to the terminal while training. Mostly useful for debugging new cost functions or optimisers (huge computational overhead).
MPS Training Hyperparameters
- `nsweeps::Int=5`: Number of MPS optimisation sweeps to perform (both forwards and backwards).
- `chi_max::Int=25`: Maximum bond dimension allowed within the MPS during the SVD step.
- `eta::Float64=0.01`: The learning rate. For gradient descent methods, this is the step size. For Optim and OptimKit, this serves as the initial step-size guess input into the linesearch.
- `d::Int=5`: The dimension of the feature map or "encoding". This is the true maximum dimension of the feature vectors. For a splitting encoding, d = num_splits * aux_basis_dim.
- `cutoff::Float64=1E-10`: Size-based cutoff for the number of singular values in the SVD (see the ITensors SVD documentation).
- `dtype::DataType=Float64 or ComplexF64` (depending on encoding): The datatype of the elements of the MPS. Supports arbitrary-precision types such as BigFloat and Complex{BigFloat}.
- `exit_early::Bool=false`: Stops training if training accuracy is 1 at the end of any sweep.
Encoding Options
- `encoding::Symbol=:Legendre`: The encoding to use, including :Stoudenmire, :Fourier, :Legendre, :SLTD, :Custom, etc.; see the Encoding docs for a complete list. Can be a time-(in)dependent orthonormal basis, or a time-(in)dependent basis mapped onto a number of "splits", which distribute tighter basis functions where the sites of a time series are more likely to be measured.
- `projected_basis::Bool=false`: Whether to project a basis onto the training data at each time. Normally, when specifying a basis of dimension d, the first d lowest-order terms are used. When `projected_basis=true`, the training data is used to construct a pdf of the possible time-series amplitudes at each time point, and the d largest terms of this pdf's series expansion are used to select the basis terms.
- `aux_basis_dim::Int=2`: Unused for standard encodings. If the encoding is a SplitBasis, serves as the auxiliary dimension of a basis mapped onto the split encoding, so that the number of histogram bins = d / aux_basis_dim.
- `encode_classes_separately::Bool=false`: Only relevant for data-driven bases. If true, the data is split up by class before being encoded. Functionally, this causes the encoding method to vary depending on the class.
Data Preprocessing and MPS initialisation
- `sigmoid_transform::Bool`: Whether to apply a sigmoid transform to the data before minmaxing. This has the form

\[\boldsymbol{X'} = \left(1 + \exp\left(-\frac{\boldsymbol{X}-m_{\boldsymbol{X}}}{r_{\boldsymbol{X}} / 1.35}\right)\right)^{-1},\]

where $\boldsymbol{X}$ is the un-normalised time-series data matrix, $m_{\boldsymbol{X}}$ is the median of $\boldsymbol{X}$, and $r_{\boldsymbol{X}}$ is its interquartile range.
- `minmax::Bool`: Whether to apply a minmax norm to [0,1] before encoding. This has the form

\[\boldsymbol{X''} = \frac{\boldsymbol{X'} - x'_{\text{min}}}{x'_{\text{max}} - x'_{\text{min}}},\]

where $\boldsymbol{X'}$ is the robust-sigmoid transformed data matrix, $\boldsymbol{X''}$ is its minmax-scaled counterpart, and $x'_\text{min}$ and $x'_\text{max}$ are the minimum and maximum of $\boldsymbol{X'}$.
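Putting the two steps together, this preprocessing can be sketched as follows (hypothetical helper names, not MPSTime's internals):

```julia
using Statistics

# Robust sigmoid: centre on the median, scale by IQR / 1.35 (≈ σ for normal data).
function robust_sigmoid(X::AbstractMatrix)
    m = median(X)
    r = quantile(vec(X), 0.75) - quantile(vec(X), 0.25)  # interquartile range
    return @. 1 / (1 + exp(-(X - m) / (r / 1.35)))
end

# Minmax norm of the sigmoid-transformed data onto [0, 1].
function minmax_scale(Xs::AbstractMatrix)
    lo, hi = extrema(Xs)
    return @. (Xs - lo) / (hi - lo)
end
```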
- `data_bounds::Tuple{Float64, Float64} = (0., 1.)`: The region to bound the data to if `minmax=true`. This is separate from the encoding domain. All encodings expect data to be scaled between 0 and 1. Setting the data bounds slightly away from [0,1] can help when your basis has poor support near its boundaries.
- `init_rng::Int`: Random seed used to generate the initial MPS.
- `chi_init::Int`: Initial bond dimension of the random MPS.
Loss Functions and Optimisation Methods
- `loss_grad::Symbol=:KLD`: The type of cost function to use for training the MPS, typically Mean Squared Error (:MSE) or KL Divergence (:KLD), but can also be a weighted sum of the two (:Mixed).
- `bbopt::Symbol=:TSGO`: Which local optimiser to use. Built-in options are gradient descent (:GD), or gradient descent with a TSGO rule (:TSGO). Other options are Conjugate Gradient descent using either the Optim or OptimKit packages (:Optim or :OptimKit respectively). The CGD methods work well for MSE-based loss functions, but seem to perform poorly for KLD-based loss functions.
- `rescale::Tuple{Bool,Bool}=(false,true)`: Has the form `rescale = (before::Bool, after::Bool)`, specifying where to enforce the normalisation of the MPS during training: whether to call normalise!(BT) before or after the bond tensor BT is updated. Note that for an MPS that starts in canonical form, rescale = (true,true) will train identically to rescale = (false,true), but may be less performant.
- `update_iters::Int=1`: Maximum number of optimiser iterations to perform for each bond tensor optimisation, e.g. the number of steps of (Conjugate) Gradient Descent used by :TSGO, :Optim, or :OptimKit.
- `train_classes_separately::Bool=false`: Whether the trainer optimises the total MPS loss over all classes, or considers each class as a separate problem. Should make very little difference.
Debug
- `return_encoding_meta_info::Bool=false`: Debug flag. Whether to return the normalised data, as well as the histogram bins for the split-basis types.
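As a usage sketch combining several of the fields above (the values chosen here are illustrative, not recommendations):

```julia
julia> opts = MPSOptions(;
           nsweeps   = 10,        # more optimisation sweeps
           chi_max   = 40,        # larger maximum bond dimension
           d         = 8,         # higher encoding dimension
           encoding  = :Fourier,  # complex-valued basis
           verbosity = 0,         # suppress progress output
       );

julia> W, info, test_states = fitMPS(X_train, y_train, X_test, y_test, opts);
```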
Convert the internal Options type into a serialisable MPSOptions.

The internal Options type, and the functions that help convert between it and MPSOptions, are listed below.
MPSTime.Options — Type

Options(; <Keyword Arguments>)

The internal options struct. Fields have the same meaning as MPSOptions, but contain objects instead of symbols, e.g. `Encoding=Basis("Legendre")` instead of `:Legendre`.
MPSTime.safe_options — Function

safe_options(opts::AbstractMPSOptions)

Takes any AbstractMPSOptions type, and returns an instantiated Options type.
MPSTime.model_encoding — Function

model_encoding(symb::Symbol, project::Bool=false)

Construct an Encoding object from `symb`. Not case sensitive. See the Encodings documentation for the full list of options. Uses the specified `project` option if the encoding supports projecting. The inverse of `symbolic_encoding`.
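Illustrative round trip (the return values shown are indicative; exact printed output may differ):

```julia
julia> e = model_encoding(:fourier)  # not case sensitive

julia> symbolic_encoding(e)          # recover the symbolic name
:Fourier
```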
MPSTime.symbolic_encoding — Function

symbolic_encoding(E::Encoding)

Construct a symbolic name from an Encoding object. The inverse of `model_encoding`.
MPSTime.model_loss_func — Function

model_loss_func(symb::Symbol)

Select a loss function (::Function) from `symb`. Not case sensitive. The inverse of `symbolic_loss_func`.
MPSTime.model_bbopt — Function

model_bbopt(symb::Symbol)

Construct a BBOpt object from `symb`. Not case sensitive.