LSDTopoTools
All Classes Files Functions Variables Friends Pages
Public Member Functions | Protected Attributes | List of all members
LSDMostLikelyPartitionsFinder Class Reference

This object looks for the most likeley partitions or segments of 2D data. More...

#include <LSDMostLikelyPartitionsFinder.hpp>

Public Member Functions

 LSDMostLikelyPartitionsFinder (int this_min_seg_length, vector< float > this_x_data, vector< float > this_y_data)
 Create LSDMostLikelyPartitionsFinder object. More...
 
int get_n_nodes ()
 
vector< float > get_MLE_of_segments ()
 
vector< float > get_x_data ()
 
vector< float > get_y_data ()
 
void reset_derived_data_members ()
 Resets all the derived data members. More...
 
void thin_data_target_dx_preserve_data (float dx)
 Collects data as close as possible to some target dx, but does not modify invidivual data points. More...
 
void thin_data_target_dx_preserve_data (float dx, vector< int > &node_ref)
 Collects data as close as possible to some target dx, but does not modify invidivual data points. More...
 
LSDMostLikelyPartitionsFinder spawn_thinned_data_target_dx_preserve_data (float dx)
 Collects data as close as possible to some target dx, but does not modify invidivual data points. More...
 
void thin_data_skip (int N, vector< int > &node_ref)
 This thins the data by skipping elements. More...
 
void thin_data_target_dx_linear_interpolation (float dx)
 Recasts data at fixed values of dx, using linear interpoloation. More...
 
LSDMostLikelyPartitionsFinder spawn_thinned_data_target_dx_linear_interpolation (float dx)
 Recasts data at fixed values of dx, using linear interpoloation. More...
 
void thin_data_monte_carlo_skip (int Mean_skip, int skip_range, vector< int > &node_ref)
 Skips nodes but using a Monte Carlo scheme that samples random points along the channel profile. More...
 
void thin_data_monte_carlo_dchi (float mean_dchi, float variation_dchi, vector< int > &node_ref)
 Thins object based on a monte carlo approach using a mean, max and minimum dchi. More...
 
void print_x_y_data_to_screen ()
 Function for looking at the x and y data. More...
 
void best_fit_driver_AIC_for_linear_segments (vector< float > sigma_values)
 Driver function to get best fit segments. More...
 
void best_fit_driver_AIC_for_linear_segments (float sigma_values)
 Driver function to get best fit segments. More...
 
void get_data_from_best_fit_lines (int node, vector< float > sigma_values, vector< float > &b_values, vector< float > &m_values, vector< float > &r2_values, vector< float > &DW_values, vector< float > &fitted_y, vector< int > &seg_lengths, float &this_MLE, int &this_n_segments, int &this_n_nodes, float &this_AIC, float &this_AICc)
 Function returns data for a given sigma value. More...
 
void get_start_and_end_x_for_segments (vector< float > &start_x, vector< float > &end_x, vector< int > seg_lengths)
 Replaces two vectors, which have the starting and ending position of the best fit segments for a given sigma. More...
 
void calculate_segment_matrices (float sigma)
 This function is used to calculate the slope, intercept, and likelihood of all possible linear segments along a series of data points. More...
 
void populate_segment_matrix (int start_node, int end_node, float no_data_value, float sigma)
 This function popultes the matrices of liklihood, m and b values. More...
 
void find_max_like_of_segments ()
 Function calculates the most likeley combination of segments given the liklihood of individual segments calcualted by the calculate_segment_matrices function. More...
 
void partition_driver_to_vecvecvec (int k)
 This function drives the partitioning algorithms. More...
 
void partitions_with_minimum_length (int n, int k, int t, vector< int > &p)
 An integer partition algorithm. More...
 
void partition_assign (int t, vector< int > &p)
 Function for use with the permutations this assigns values into the vecvecvec that contains all the partitioning information. More...
 
int LSDpartitions_min (int x, int y)
 Function for use with the permutations gets the mininum of two values. More...
 
Array2D< float > normalize_like_matrix_to_sigma_one (float sigma)
 This takes a likelihood array that has been calcualted with a given sigma value and normalizes the sigma values as though sigma was equal to 1. More...
 
vector< float > normalize_like_vector_to_sigma_one (float sigma, vector< float > like_vector)
 Normalizes but with vector data, for use with MLE vector for segments. More...
 
void change_normalized_like_matrix_to_new_sigma (float sigma, Array2D< float > &sig1_like_array)
 Takes a normalized likelihood array and updates the values to a new sigma value. More...
 
vector< float > change_normalized_like_vector_to_new_sigma (float sigma, vector< float > sig1_like_vector)
 Takes a normalized likelihood vector and updates the values to a new sigma value. More...
 
vector< float > transform_like_from_sigma1_to_sigma2 (float sigma1, vector< float > sig1_like_vector, float sigma2)
 Takes a normalized likelihood vector and updates the values to a new sigma value. More...
 
void get_n_segments_for_various_sigma (vector< float > sigma_values)
 Function takes the normalized MLE values (normalized with sigma = 1) and returns the best fit number of segments from both the AIC and the AICc measures. More...
 
void calculate_AIC_of_segments_with_variable_sigma (float sigma, vector< float > &AIC_of_segments, vector< float > &AICc_of_segments)
 Function calculates AIC and AICc of segments taking the maximum_MLE based on a sigma of one. More...
 
void get_properties_of_best_fit_segments (int bestfit_segments_node, vector< float > &m_values, vector< float > &b_values, vector< float > &r2_values, vector< float > &DW_values)
 Function extracts the m, b, r^2 and DW statistic of the most likeley segments. More...
 
void print_to_screen_most_likeley_segment_lengths ()
 Function prints the most likeley segment lengths to screen. More...
 
void print_AIC_and_AICc_to_screen (vector< float > sigma_values)
 Function prints AIC and AICc values to screen. More...
 

Protected Attributes

int minimum_segment_length
 Minimum segment length.
 
vector< float > x_data
 X data.
 
vector< float > y_data
 Y data.
 
float base_sigma
 The base sigma value from which the MLE of the segments is calcluated.
 
Array2D< float > like_array
 Liklihood array. Indexed so the row is the starting node and the column is the ending node.
 
Array2D< float > m_array
 Slope array. Indexed so the row is the starting node and the column is the ending node.
 
Array2D< float > b_array
 Intercept array. Indexed so the row is the starting node and the column is the ending node.
 
Array2D< float > rsquared_array
 R^2 array. Indexed so the row is the starting node and the column is the ending node.
 
Array2D< float > DW_array
 Array of Durbin-Watson statistics to test if the residuals are autocorrelated. More...
 
vector< float > MLE_of_segments
 Maximum likelihood of the different number of segments.
 
vector< vector< int > > segments_for_each_n_segments
 Each element of this vector contains the most likeley segments for that number of segments. More...
 
vector< vector< vector< int > > > partitions
 This is vecvecvec. More...
 
vector< int > best_fit_AIC
 Vector of best fit AIC values.
 
vector< int > best_fit_AICc
 Vector of best fit AICc values.
 
vector< vector< float > > AIC_for_each_n_segments
 Vector of vectors of AIC values.
 
vector< vector< float > > AICc_for_each_n_segments
 Vector of vectors of AICc values.
 

Detailed Description

This object looks for the most likeley partitions or segments of 2D data.

Constructor & Destructor Documentation

LSDMostLikelyPartitionsFinder::LSDMostLikelyPartitionsFinder ( int  this_min_seg_length,
vector< float >  this_x_data,
vector< float >  this_y_data 
)
inline

Create LSDMostLikelyPartitionsFinder object.

Parameters
this_min_seg_lengthint Minimum segement length required.
this_x_datavector<float> X data.
this_y_datavector<float> Y data.
Author
SMM
Date
01/03/13

Member Function Documentation

void LSDMostLikelyPartitionsFinder::best_fit_driver_AIC_for_linear_segments ( vector< float >  sigma_values)

Driver function to get best fit segments.

Parameters
sigma_valuesvector<float> a vector containing sigma values for each node
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::best_fit_driver_AIC_for_linear_segments ( float  sigma_values)

Driver function to get best fit segments.

Parameters
sigma_values
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::calculate_AIC_of_segments_with_variable_sigma ( float  sigma,
vector< float > &  AIC_of_segments,
vector< float > &  AICc_of_segments 
)

Function calculates AIC and AICc of segments taking the maximum_MLE based on a sigma of one.

Parameters
sigmaStandard deviation of error.
AIC_of_segments
AICc_of_segments
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::calculate_segment_matrices ( float  sigma)

This function is used to calculate the slope, intercept, and likelihood of all possible linear segments along a series of data points.

The function requires the data in x and y vectors, the maximum segment length and sigma, the standard deviation of the measured data. This will be approxamately the error in the surface elevation, although it might have to be increased simply because likelihood will tend to zero if this is too small. sigma should also be considered to contain the 'noise' inherent in channel incision so perhaps 1-5 metres is appropriate the maximum segment length is an integer: it is the number of data points used. these data points from raw chi data are irregularly spaced so two segments of the same 'length' can have different lengths in chi space. One remedey for this is a preprocessor that places the zeta vs chi data along evenly spaced points.

The routine generates three matrices. The row of the matrix is the starting node of the segment. The column of the matrix is the ending node of the segment. Thus the routine will generate a matrix that is dimension n x n where n is the number of data points.

One potential future development is to implement this using a sparse matrix from the boost mtl library to reduce the memory usage.

Parameters
sigmaStandard deviation of error.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::change_normalized_like_matrix_to_new_sigma ( float  sigma,
Array2D< float > &  sig1_like_array 
)

Takes a normalized likelihood array and updates the values to a new sigma value.

Parameters
sigmaStandard deviation of error.
sig1_like_array
Author
SMM
Date
01/03/13
vector< float > LSDMostLikelyPartitionsFinder::change_normalized_like_vector_to_new_sigma ( float  sigma,
vector< float >  sig1_like_vector 
)

Takes a normalized likelihood vector and updates the values to a new sigma value.

Parameters
sigmaStandard deviation of error.
sig1_like_vector
Returns
Updated sigma vector.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::find_max_like_of_segments ( )

Function calculates the most likeley combination of segments given the liklihood of individual segments calcualted by the calculate_segment_matrices function.

Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::get_data_from_best_fit_lines ( int  node,
vector< float >  sigma_values,
vector< float > &  b_values,
vector< float > &  m_values,
vector< float > &  r2_values,
vector< float > &  DW_values,
vector< float > &  fitted_y,
vector< int > &  seg_lengths,
float &  this_MLE,
int &  this_n_segments,
int &  this_n_nodes,
float &  this_AIC,
float &  this_AICc 
)

Function returns data for a given sigma value.

this_MLE, this_n_segments and this_n_nodes are all returned so the user can combine two or more segments and get an AIC or AICc.

Parameters
nodeindex into the sigma vector. This requires a bit of explantion: because sigma can be extracted from the product that calculates MLE as a constant we can have multiple values of sigma and recalculate MLE for any given sigma When this code was written we added a function to modify the sigma values. In this case there are no normalisation values applied so if you use a random sigma here you will get the base sigma (supplied to the main computation function)
sigma_values
b_valuesA vector containing the b value (intercept) of each segment
m_valuesA vector containing the m value (slope) of each segment
r2_valuesA vector containing the r^2 value of each segment
DW_valuesA vector containing the durbin-watson value of each segment
fitted_yA vector containing the y value of each segment
seg_lengthsA vector with the lengths of each segment
this_MLEThe MLE of the best fit.
this_n_segmentsThe number of segments
this_n_nodesThe number of total nodes
this_AICThe Aikake information criterion
this_AICcThe corrected Aikake information criterion (needed for finite sample size)
Author
SMM
Date
01/03/13
vector<float> LSDMostLikelyPartitionsFinder::get_MLE_of_segments ( )
inline
Returns
Maximum Likelihood Estimator of segments.
int LSDMostLikelyPartitionsFinder::get_n_nodes ( )
inline
Returns
number of nodes.
void LSDMostLikelyPartitionsFinder::get_n_segments_for_various_sigma ( vector< float >  sigma_values)

Function takes the normalized MLE values (normalized with sigma = 1) and returns the best fit number of segments from both the AIC and the AICc measures.

It also returns two vector of vectors which are the AIC values for the varius values of sigma passed to the function in the sigma values vector.

Parameters
sigma_valuesvector of sigma values.
void LSDMostLikelyPartitionsFinder::get_properties_of_best_fit_segments ( int  bestfit_segments_node,
vector< float > &  m_values,
vector< float > &  b_values,
vector< float > &  r2_values,
vector< float > &  DW_values 
)

Function extracts the m, b, r^2 and DW statistic of the most likeley segments.

Parameters
bestfit_segments_node
m_values
b_values
r2_values
DW_values
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::get_start_and_end_x_for_segments ( vector< float > &  start_x,
vector< float > &  end_x,
vector< int >  seg_lengths 
)

Replaces two vectors, which have the starting and ending position of the best fit segments for a given sigma.

This gets data from (most likeley) the get_data_from_best_fit_lines function.

Parameters
start_x
end_x
seg_lengths
Author
SMM
Date
01/03/13
vector<float> LSDMostLikelyPartitionsFinder::get_x_data ( )
inline
Returns
Vector of X data.
vector<float> LSDMostLikelyPartitionsFinder::get_y_data ( )
inline
Returns
Vector of Y data.
int LSDMostLikelyPartitionsFinder::LSDpartitions_min ( int  x,
int  y 
)

Function for use with the permutations gets the mininum of two values.

Parameters
x
y
Returns
Minimum of the two values.
Author
SMM
Date
01/03/13
Array2D< float > LSDMostLikelyPartitionsFinder::normalize_like_matrix_to_sigma_one ( float  sigma)

This takes a likelihood array that has been calcualted with a given sigma value and normalizes the sigma values as though sigma was equal to 1.

Parameters
sigmaStandard deviation of error.
Returns
Normalized sigma matrix.
Author
SMM
Date
01/03/13
vector< float > LSDMostLikelyPartitionsFinder::normalize_like_vector_to_sigma_one ( float  sigma,
vector< float >  like_vector 
)

Normalizes but with vector data, for use with MLE vector for segments.

Parameters
sigmaStandard deviation of error.
like_vectorLikelihood vector.
Returns
Normalized sigma vector.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::partition_assign ( int  t,
vector< int > &  p 
)

Function for use with the permutations this assigns values into the vecvecvec that contains all the partitioning information.

Parameters
t
p
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::partition_driver_to_vecvecvec ( int  k)

This function drives the partitioning algorithms.

Parameters
kNumber of elements in the partition.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::partitions_with_minimum_length ( int  n,
int  k,
int  t,
vector< int > &  p 
)

An integer partition algorithm.

Algorithm and original Pascal implementation: Frank Ruskey, 1995.
Translation to C: Joe Sawada, 1997
grabbed from http://theory.cs.uvic.ca/inf/nump/NumPartition.html
adapted smm 21/12/2012
algorith described in
http://mathworld.wolfram.com/PartitionFunctionP.html
and
Skiena, S. Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Reading, MA: Addison-Wesley, 1990.
This is a further adaptation that only presents solution to the partition with segments of a minimum length it stores all the partitions.
http://www.codeguru.com/cpp/cpp/algorithms/article.php/c5123/Permutations-in-C.htm
http://www.cplusplus.com/reference/algorithm/next_permutation/
http://mdm4u1.wetpaint.com/page/4.3+Permutations+with+Some+Identical+Elements

Parameters
n
k
t
p
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::populate_segment_matrix ( int  start_node,
int  end_node,
float  no_data_value,
float  sigma 
)

This function popultes the matrices of liklihood, m and b values.

It is a recursive algorithm so in fact it doesn't just get one row but drills down through all the possible starting nodes to complete the matrix.

Parameters
start_node
end_node
no_data_valueNo data value
sigmaStandard deviation of error.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::print_AIC_and_AICc_to_screen ( vector< float >  sigma_values)

Function prints AIC and AICc values to screen.

Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::print_to_screen_most_likeley_segment_lengths ( )

Function prints the most likeley segment lengths to screen.

Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::print_x_y_data_to_screen ( )

Function for looking at the x and y data.

Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::reset_derived_data_members ( )

Resets all the derived data members.

Derived data members are AIC, AICc, fit statistics, liklihoods (in the liklihood matrix, etc.) Basically everything except for the x and y data.

Author
SMM
Date
01/03/13
LSDMostLikelyPartitionsFinder LSDMostLikelyPartitionsFinder::spawn_thinned_data_target_dx_linear_interpolation ( float  dx)

Recasts data at fixed values of dx, using linear interpoloation.

Parameters
dxTarget dx for thinning.
Returns
New thinned LSDMostLikelyPartitionsFinder object.
Author
SMM
Date
01/03/13
LSDMostLikelyPartitionsFinder LSDMostLikelyPartitionsFinder::spawn_thinned_data_target_dx_preserve_data ( float  dx)

Collects data as close as possible to some target dx, but does not modify invidivual data points.

Parameters
dxTarget dx for thinning.
Returns
New thinned LSDMostLikelyPartitionsFinder object.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::thin_data_monte_carlo_dchi ( float  mean_dchi,
float  variation_dchi,
vector< int > &  node_ref 
)

Thins object based on a monte carlo approach using a mean, max and minimum dchi.

Parameters
mean_dchi
variation_dchi
node_refAn index vector of the data points that were selected.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::thin_data_monte_carlo_skip ( int  Mean_skip,
int  skip_range,
vector< int > &  node_ref 
)

Skips nodes but using a Monte Carlo scheme that samples random points along the channel profile.

Parameters
Mean_skip
skip_range
node_refAn index vector of the data points that were selected.
Author
SMM
Date
01/05/13
void LSDMostLikelyPartitionsFinder::thin_data_skip ( int  N,
vector< int > &  node_ref 
)

This thins the data by skipping elements.

A positive number N means it skips N elements after each element.

A negative number -N means that after N elements it skips one element.

Parameters
Nskip value.
node_refAn index vector of the data points that were selected.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::thin_data_target_dx_linear_interpolation ( float  dx)

Recasts data at fixed values of dx, using linear interpoloation.

Parameters
dxTarget dx for thinning.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::thin_data_target_dx_preserve_data ( float  dx)

Collects data as close as possible to some target dx, but does not modify invidivual data points.

Parameters
dxTarget dx for thinning.
Author
SMM
Date
01/03/13
void LSDMostLikelyPartitionsFinder::thin_data_target_dx_preserve_data ( float  dx,
vector< int > &  node_ref 
)

Collects data as close as possible to some target dx, but does not modify invidivual data points.

Parameters
dxTarget dx for thinning.
node_refAn index vector of the data points that were selected.
Author
SMM
Date
01/03/13
vector< float > LSDMostLikelyPartitionsFinder::transform_like_from_sigma1_to_sigma2 ( float  sigma1,
vector< float >  sig1_like_vector,
float  sigma2 
)

Takes a normalized likelihood vector and updates the values to a new sigma value.

Parameters
sigma1Standard deviation of error.
sig1_like_vector
sigma2Standard deviation of error.
Returns
Updated sigma vector.
Author
SMM
Date
01/03/13

Member Data Documentation

Array2D<float> LSDMostLikelyPartitionsFinder::DW_array
protected

Array of Durbin-Watson statistics to test if the residuals are autocorrelated.

Used to determine if the segment is truly linear. Values less than 1 indicate that the segment is probably not linear values < 1.5 should arouse suspicion. Indexed so the row is the starting node and the column is the ending node.

vector< vector < vector<int> > > LSDMostLikelyPartitionsFinder::partitions
protected

This is vecvecvec.

Top index references the number of segments. Second layer loops through the possible partitions of that number of segments. Third layer is the individual segment lengths.

vector< vector<int> > LSDMostLikelyPartitionsFinder::segments_for_each_n_segments
protected

Each element of this vector contains the most likeley segments for that number of segments.

So for example segments_for_each_n_segments[3] is a vector containing the lengths of the most likeley segments for 4 segments (note 0 indexing).


The documentation for this class was generated from the following files: