libMesh::StatisticsVector< T > Class Template Reference

A std::vector derived class for implementing simple statistical algorithms.

#include <statistics.h>

Public Member Functions

 StatisticsVector (dof_id_type i=0)
 
 StatisticsVector (dof_id_type i, T val)
 
virtual ~StatisticsVector ()
 
virtual Real l2_norm () const
 
virtual T minimum () const
 
virtual T maximum () const
 
virtual Real mean () const
 
virtual Real median ()
 
virtual Real median () const
 
virtual Real variance () const
 
virtual Real variance (const Real known_mean) const
 
virtual Real stddev () const
 
virtual Real stddev (const Real known_mean) const
 
void normalize ()
 
virtual void histogram (std::vector< dof_id_type > &bin_members, unsigned int n_bins=10)
 
void plot_histogram (const processor_id_type my_procid, const std::string &filename, unsigned int n_bins)
 
virtual void histogram (std::vector< dof_id_type > &bin_members, unsigned int n_bins=10) const
 
virtual std::vector< dof_id_type > cut_below (Real cut) const
 
virtual std::vector< dof_id_type > cut_above (Real cut) const
 

Detailed Description

template<typename T>
class libMesh::StatisticsVector< T >

A std::vector derived class for implementing simple statistical algorithms.

The StatisticsVector class is derived from std::vector<> and therefore has all of its useful features. It was designed to have no internal state, i.e. no public or private data members. It was also designed only for classes and types for which the operators +, *, and / have meaning, specifically floats, doubles, ints, etc. The main reason for this design decision was to allow a std::vector<> to be successfully cast to a StatisticsVector, thereby enabling its additional functionality. We do not anticipate any problems with deriving from an STL container which lacks a virtual destructor in this case.

Where manipulation of the data set was necessary (for example sorting), two versions of the member functions have been implemented. The non-const versions sort the data set in place, changing the order of the entries, so any index, pointer, or iterator held by the caller may afterwards refer to a different value. const versions of the same functions are generally available, and will be automatically invoked on const StatisticsVector objects. A drawback of the const versions is that they simply make a copy of the original object and therefore double the original memory requirement for the data set.
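The const/non-const pairing described above can be illustrated with a small standalone sketch. SketchVector is a hypothetical stand-in written only for this example: it is not the libMesh class, it uses double instead of Real, and it assumes nothing beyond the standard library. The non-const median() sorts in place, while the const overload sorts a temporary copy.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for StatisticsVector<double>; not the libMesh class.
struct SketchVector : public std::vector<double>
{
  // Non-const overload: sorts the stored data in place.
  double median()
  {
    if (this->empty())
      return 0.;
    std::sort(this->begin(), this->end());
    const std::size_t lhs = (this->size() - 1) / 2;
    const std::size_t rhs = this->size() / 2;
    // For an odd number of entries lhs == rhs, so this is just the middle value.
    return ((*this)[lhs] + (*this)[rhs]) / 2.0;
  }

  // Const overload: sorts a temporary copy, leaving *this untouched
  // at the cost of holding the data set in memory twice.
  double median() const
  {
    SketchVector copy(*this);
    return copy.median();
  }
};
```

Calling median() through a const reference selects the const overload and preserves the original order; calling it on a non-const object reorders the stored entries.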

Most of the actual code was copied or adapted from the GNU Scientific Library (GSL). More precisely, the recurrence relations for computing the mean were implemented in order to avoid possible overflow when summing large data sets.

Author
John W. Peterson
Date
2002

Definition at line 76 of file statistics.h.

Constructor & Destructor Documentation

template<typename T>
libMesh::StatisticsVector< T >::StatisticsVector ( dof_id_type  i = 0)
inline explicit

Call the std::vector constructor.

Definition at line 84 of file statistics.h.

: std::vector<T> (i) {}

template<typename T>
libMesh::StatisticsVector< T >::StatisticsVector ( dof_id_type  i,
T  val 
)
inline

Call the std::vector constructor, filling each entry with val.

Definition at line 89 of file statistics.h.

: std::vector<T> (i,val) {}

template<typename T>
virtual libMesh::StatisticsVector< T >::~StatisticsVector ( )
inline virtual

Destructor. Virtual so that classes can derive from StatisticsVector.

Definition at line 94 of file statistics.h.

{}

Member Function Documentation

template<typename T >
std::vector< dof_id_type > libMesh::StatisticsVector< T >::cut_above ( Real  cut) const
virtual

Returns a vector of dof_id_types corresponding to the indices of every member of the data set above the cutoff value cut. cut_above and cut_below were deliberately kept as two separate functions, since the interface is cleaner with one passed parameter instead of two.

Reimplemented in libMesh::ErrorVector.

Definition at line 349 of file statistics.C.

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::stddev().

{
  LOG_SCOPE ("cut_above()", "StatisticsVector");

  const dof_id_type n = cast_int<dof_id_type>(this->size());

  std::vector<dof_id_type> cut_indices;
  cut_indices.reserve(n/2); // Arbitrary

  for (dof_id_type i=0; i<n; i++)
    if ((*this)[i] > cut)
      cut_indices.push_back(i);

  return cut_indices;
}
template<typename T >
std::vector< dof_id_type > libMesh::StatisticsVector< T >::cut_below ( Real  cut) const
virtual

Returns a vector of dof_id_types which correspond to the indices of every member of the data set below the cutoff value "cut".

Reimplemented in libMesh::ErrorVector.

Definition at line 325 of file statistics.C.

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::stddev().

{
  LOG_SCOPE ("cut_below()", "StatisticsVector");

  const dof_id_type n = cast_int<dof_id_type>(this->size());

  std::vector<dof_id_type> cut_indices;
  cut_indices.reserve(n/2); // Arbitrary

  for (dof_id_type i=0; i<n; i++)
    {
      if ((*this)[i] < cut)
        {
          cut_indices.push_back(i);
        }
    }

  return cut_indices;
}
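Both cut functions use strict comparisons, so an entry exactly equal to cut is reported by neither. The following standalone sketch mirrors that behavior on a plain std::vector<double>; the names indices_above and indices_below are hypothetical, and std::size_t is used in place of dof_id_type.

```cpp
#include <cstddef>
#include <vector>

// Indices i with data[i] > cut (strict, as in cut_above()).
inline std::vector<std::size_t>
indices_above(const std::vector<double> & data, double cut)
{
  std::vector<std::size_t> cut_indices;
  for (std::size_t i = 0; i < data.size(); ++i)
    if (data[i] > cut)
      cut_indices.push_back(i);
  return cut_indices;
}

// Indices i with data[i] < cut (strict, as in cut_below()).
inline std::vector<std::size_t>
indices_below(const std::vector<double> & data, double cut)
{
  std::vector<std::size_t> cut_indices;
  for (std::size_t i = 0; i < data.size(); ++i)
    if (data[i] < cut)
      cut_indices.push_back(i);
  return cut_indices;
}
```

Because the results are indices into the current ordering, they stop being meaningful once a sorting member function such as the non-const median() or histogram() has been called on the same data.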
template<typename T >
void libMesh::StatisticsVector< T >::histogram ( std::vector< dof_id_type > &  bin_members,
unsigned int  n_bins = 10 
)
virtual

Computes a histogram with n_bins bins for the data set and returns it through bin_members. For simplicity, the bins are assumed to be of uniform size. Upon return, the bin_members vector will contain the number of members in each bin. WARNING: This non-const function sorts the vector, changing its order. Source: GNU Scientific Library.

Definition at line 178 of file statistics.C.

References end, libMesh::libmesh_assert(), std::max(), std::min(), libMesh::out, and libMesh::Real.

Referenced by libMesh::StatisticsVector< T >::histogram(), and libMesh::StatisticsVector< ErrorVectorReal >::stddev().

{
  // Must have at least 1 bin
  libmesh_assert (n_bins>0);

  const dof_id_type n = cast_int<dof_id_type>(this->size());

  std::sort(this->begin(), this->end());

  // The StatisticsVector can hold both integer and float types.
  // We will define all the bins, etc. using Reals.
  Real min = static_cast<Real>(this->minimum());
  Real max = static_cast<Real>(this->maximum());
  Real bin_size = (max - min) / static_cast<Real>(n_bins);

  LOG_SCOPE ("histogram()", "StatisticsVector");

  std::vector<Real> bin_bounds(n_bins+1);
  for (std::size_t i=0; i<bin_bounds.size(); i++)
    bin_bounds[i] = min + i * bin_size;

  // Give the last bin boundary a little wiggle room: we don't want
  // it to be just barely less than the max, otherwise our bin test below
  // may fail.
  bin_bounds.back() += 1.e-6 * bin_size;

  // This vector will store the number of members each bin has.
  bin_members.resize(n_bins);

  dof_id_type data_index = 0;
  for (std::size_t j=0; j<bin_members.size(); j++) // bin vector indexing
    {
      // libMesh::out << "(debug) Filling bin " << j << std::endl;

      for (dof_id_type i=data_index; i<n; i++) // data vector indexing
        {
          //libMesh::out << "(debug) Processing index=" << i << std::endl;
          Real current_val = static_cast<Real>( (*this)[i] );

          // There may be entries in the vector smaller than the value
          // reported by this->minimum(). (e.g. inactive elements in an
          // ErrorVector.) We just skip entries like that.
          if ( current_val < min )
            {
              // libMesh::out << "(debug) Skipping entry v[" << i << "]="
              // << (*this)[i]
              // << " which is less than the min value: min="
              // << min << std::endl;
              continue;
            }

          // if outside the current bin (bin[j] is bounded
          // by bin_bounds[j] and bin_bounds[j+1])
          if ( current_val > bin_bounds[j+1] )
            {
              // libMesh::out.precision(16);
              // libMesh::out.setf(std::ios_base::fixed);
              // libMesh::out << "(debug) (*this)[i]= " << (*this)[i]
              // << " is greater than bin_bounds[j+1]="
              // << bin_bounds[j+1] << std::endl;
              data_index = i; // start searching here for next bin
              break; // go to next bin
            }

          // Otherwise, increment current bin's count
          bin_members[j]++;
          // libMesh::out << "(debug) Binned index=" << i << std::endl;
        }
    }

#ifdef DEBUG
  // Check the number of binned entries
  const dof_id_type n_binned = std::accumulate(bin_members.begin(),
                                               bin_members.end(),
                                               static_cast<dof_id_type>(0),
                                               std::plus<dof_id_type>());

  if (n != n_binned)
    {
      libMesh::out << "Warning: The number of binned entries, n_binned="
                   << n_binned
                   << ", did not match the total number of entries, n="
                   << n << "." << std::endl;
    }
#endif
}
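For comparison, uniform binning can also be done without sorting by computing each entry's bin index directly. The sketch below uses that alternative technique and is not the algorithm listed above: uniform_histogram is a hypothetical helper, it works on a plain std::vector<double>, and it places a value lying exactly on an interior bin boundary in the upper bin, whereas the sorted sweep above keeps such a value in the lower bin.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of libMesh: counts entries per uniform bin
// between the smallest and largest value, without reordering the data.
inline std::vector<std::size_t>
uniform_histogram(const std::vector<double> & data, unsigned int n_bins)
{
  std::vector<std::size_t> counts(n_bins, 0);
  if (data.empty() || n_bins == 0)
    return counts;

  const auto mm = std::minmax_element(data.begin(), data.end());
  const double lo = *mm.first;
  const double hi = *mm.second;
  const double bin_size = (hi - lo) / static_cast<double>(n_bins);

  for (double v : data)
    {
      // Degenerate case: all values are equal, so everything lands in bin 0.
      std::size_t j = 0;
      if (bin_size > 0.)
        j = std::min(static_cast<std::size_t>((v - lo) / bin_size),
                     static_cast<std::size_t>(n_bins - 1)); // clamp the maximum
      ++counts[j];
    }

  return counts;
}
```

Since the input is taken by const reference and never sorted, this variant needs neither the in-place reordering nor the full copy made by the const overload.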
template<typename T >
void libMesh::StatisticsVector< T >::histogram ( std::vector< dof_id_type > &  bin_members,
unsigned int  n_bins = 10 
) const
virtual

A const version of the histogram function.

Definition at line 313 of file statistics.C.

References libMesh::StatisticsVector< T >::histogram().

{
  StatisticsVector<T> sv = (*this);

  return sv.histogram(bin_members, n_bins);
}
template<typename T >
Real libMesh::StatisticsVector< T >::l2_norm ( ) const
virtual

Returns the l2 norm of the data set.

Definition at line 36 of file statistics.C.

References libMesh::Real.

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::~StatisticsVector().

{
  Real normsq = 0.;
  const dof_id_type n = cast_int<dof_id_type>(this->size());
  for (dof_id_type i = 0; i != n; ++i)
    normsq += ((*this)[i] * (*this)[i]);

  return std::sqrt(normsq);
}
template<typename T >
T libMesh::StatisticsVector< T >::maximum ( ) const
virtual

Returns the maximum value in the data set.

Definition at line 61 of file statistics.C.

References end, and std::max().

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::~StatisticsVector().

{
  LOG_SCOPE ("maximum()", "StatisticsVector");

  const T max = *(std::max_element(this->begin(), this->end()));

  return max;
}
template<typename T >
Real libMesh::StatisticsVector< T >::mean ( ) const
virtual

Returns the mean value of the data set using a recurrence relation. Source: GNU Scientific Library

Reimplemented in libMesh::ErrorVector.

Definition at line 74 of file statistics.C.

References libMesh::Real.

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::variance(), and libMesh::StatisticsVector< ErrorVectorReal >::~StatisticsVector().

{
  LOG_SCOPE ("mean()", "StatisticsVector");

  const dof_id_type n = cast_int<dof_id_type>(this->size());

  Real the_mean = 0;

  for (dof_id_type i=0; i<n; i++)
    {
      the_mean += ( static_cast<Real>((*this)[i]) - the_mean ) /
        static_cast<Real>(i + 1);
    }

  return the_mean;
}
template<typename T >
Real libMesh::StatisticsVector< T >::median ( )
virtual

Returns the median (i.e. the middle) value of the data set. This function modifies the original data by sorting, so it can't be called on const objects. Source: GNU Scientific Library.

Reimplemented in libMesh::ErrorVector.

Definition at line 95 of file statistics.C.

References end, and libMesh::Real.

Referenced by libMesh::ErrorVector::median(), libMesh::StatisticsVector< T >::median(), and libMesh::StatisticsVector< ErrorVectorReal >::~StatisticsVector().

{
  const dof_id_type n = cast_int<dof_id_type>(this->size());

  if (n == 0)
    return 0.;

  LOG_SCOPE ("median()", "StatisticsVector");

  std::sort(this->begin(), this->end());

  const dof_id_type lhs = (n-1) / 2;
  const dof_id_type rhs = n / 2;

  Real the_median = 0;


  if (lhs == rhs)
    {
      the_median = static_cast<Real>((*this)[lhs]);
    }

  else
    {
      the_median = ( static_cast<Real>((*this)[lhs]) +
                     static_cast<Real>((*this)[rhs]) ) / 2.0;
    }

  return the_median;
}
template<typename T >
Real libMesh::StatisticsVector< T >::median ( ) const
virtual

A const version of the median function. Requires twice the memory of the original data set but does not change the original.

Reimplemented in libMesh::ErrorVector.

Definition at line 130 of file statistics.C.

References libMesh::StatisticsVector< T >::median().

{
  StatisticsVector<T> sv = (*this);

  return sv.median();
}
template<typename T >
T libMesh::StatisticsVector< T >::minimum ( ) const
virtual

Returns the minimum value in the data set.

Reimplemented in libMesh::ErrorVector.

Definition at line 48 of file statistics.C.

References end, and std::min().

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::~StatisticsVector().

{
  LOG_SCOPE ("minimum()", "StatisticsVector");

  const T min = *(std::min_element(this->begin(), this->end()));

  return min;
}
template<typename T >
void libMesh::StatisticsVector< T >::normalize ( )

Divides all entries by the largest entry and stores the result.

Definition at line 164 of file statistics.C.

References std::max(), and libMesh::Real.

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::stddev().

{
  const dof_id_type n = cast_int<dof_id_type>(this->size());
  const Real max = this->maximum();

  for (dof_id_type i=0; i<n; i++)
    (*this)[i] = static_cast<T>((*this)[i] / max);
}
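Because the quotient is cast back to T, the effect of normalizing depends on the element type. The standalone sketch below uses a hypothetical normalize_sketch on a plain std::vector<T>, with double in place of Real. The empty-vector guard is an addition made for this example and does not appear in the listing above.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: divides every entry by the largest entry.
template <typename T>
void normalize_sketch(std::vector<T> & data)
{
  if (data.empty()) // guard added for this sketch only
    return;

  const double max =
    static_cast<double>(*std::max_element(data.begin(), data.end()));

  // The division is done in floating point and then cast back to T,
  // so integer element types are truncated toward zero.
  for (T & v : data)
    v = static_cast<T>(v / max);
}
```

With positive floating-point entries the result lies in (0, 1]; with positive integer entries every value except the maximum truncates to 0. A largest entry of zero leads to division by zero, which the sketch does not guard against.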
template<typename T >
void libMesh::StatisticsVector< T >::plot_histogram ( const processor_id_type  my_procid,
const std::string &  filename,
unsigned int  n_bins 
)

Generates a Matlab/Octave style file which can be used to make a plot of the histogram having the desired number of bins. Uses the histogram(...) function in this class. WARNING: The histogram(...) function is non-const, and changes the order of the vector.

Definition at line 270 of file statistics.C.

References std::max(), and std::min().

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::stddev().

{
  // First generate the histogram with the desired number of bins
  std::vector<dof_id_type> bin_members;
  this->histogram(bin_members, n_bins);

  // The max, min and bin size are used to generate x-axis values.
  T min = this->minimum();
  T max = this->maximum();
  T bin_size = (max - min) / static_cast<T>(n_bins);

  // On processor 0: Write histogram to file
  if (my_procid==0)
    {
      std::ofstream out_stream (filename.c_str());

      out_stream << "clear all\n";
      out_stream << "clf\n";
      //out_stream << "x=linspace(" << min << "," << max << "," << n_bins+1 << ");\n";

      // abscissa values are located at the center of each bin.
      out_stream << "x=[";
      for (std::size_t i=0; i<bin_members.size(); ++i)
        {
          out_stream << min + (i+0.5)*bin_size << " ";
        }
      out_stream << "];\n";

      out_stream << "y=[";
      for (std::size_t i=0; i<bin_members.size(); ++i)
        {
          out_stream << bin_members[i] << " ";
        }
      out_stream << "];\n";
      out_stream << "bar(x,y);\n";
    }
}
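The data lines of the generated script can be sketched in isolation. The hypothetical octave_bar_script below writes to a string instead of a file, takes the minimum, bin size, and counts as plain arguments, and omits the leading "clear all" and "clf" lines; it only illustrates how the bin-center abscissae are formed.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds the x, y and bar() lines of the script.
inline std::string octave_bar_script(double min,
                                     double bin_size,
                                     const std::vector<std::size_t> & counts)
{
  std::ostringstream out;

  // Abscissa values sit at the center of each bin.
  out << "x=[";
  for (std::size_t i = 0; i < counts.size(); ++i)
    out << min + (static_cast<double>(i) + 0.5) * bin_size << " ";
  out << "];\n";

  // Ordinate values are the bin counts.
  out << "y=[";
  for (std::size_t c : counts)
    out << c << " ";
  out << "];\n";

  out << "bar(x,y);\n";
  return out.str();
}
```

Note that the listing above computes bin_size in type T, so for an integer T the division is an integer division, while histogram() computes its bin size as a Real. The sketch sidesteps this by taking bin_size as a double.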
template<typename T>
virtual Real libMesh::StatisticsVector< T >::stddev ( ) const
inline virtual

Computes the standard deviation of the data set, which is simply the square-root of the variance.

Definition at line 164 of file statistics.h.

  { return std::sqrt(this->variance()); }
template<typename T>
virtual Real libMesh::StatisticsVector< T >::stddev ( const Real  known_mean) const
inline virtual

Computes the standard deviation of the data set, which is simply the square-root of the variance. This method can be used for efficiency when the mean has already been computed.

Definition at line 173 of file statistics.h.

  { return std::sqrt(this->variance(known_mean)); }
template<typename T>
virtual Real libMesh::StatisticsVector< T >::variance ( ) const
inline virtual

Computes the variance of the data set. Uses a recurrence relation to prevent data overflow for large sums. Note: The variance is equal to the standard deviation squared. Source: GNU Scientific Library

Reimplemented in libMesh::ErrorVector.

Definition at line 144 of file statistics.h.

Referenced by libMesh::StatisticsVector< ErrorVectorReal >::stddev(), and libMesh::StatisticsVector< ErrorVectorReal >::variance().

  { return this->variance(this->mean()); }
template<typename T >
Real libMesh::StatisticsVector< T >::variance ( const Real  known_mean) const
virtual

Computes the variance of the data set where the mean is provided. This is useful for efficiency when you have already calculated the mean. Uses a recurrence relation to prevent data overflow for large sums. Note: The variance is equal to the standard deviation squared. Source: GNU Scientific Library

Reimplemented in libMesh::ErrorVector.

Definition at line 141 of file statistics.C.

References libMesh::Real.

{
  const dof_id_type n = cast_int<dof_id_type>(this->size());

  LOG_SCOPE ("variance()", "StatisticsVector");

  Real the_variance = 0;

  for (dof_id_type i=0; i<n; i++)
    {
      const Real delta = ( static_cast<Real>((*this)[i]) - mean_in );
      the_variance += (delta * delta - the_variance) /
        static_cast<Real>(i + 1);
    }

  if (n > 1)
    the_variance *= static_cast<Real>(n) / static_cast<Real>(n - 1);

  return the_variance;
}
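The loop above is a running mean of the squared deviations from the supplied mean, and the final step rescales by n/(n-1) to give the sample variance. The standalone sketch below restates this with hypothetical names variance_known_mean and stddev_known_mean, using double in place of Real. The supplied mean is trusted as given, so passing a value that is not the mean of the data yields a different quantity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: sample variance given an already computed mean.
inline double variance_known_mean(const std::vector<double> & data,
                                  double known_mean)
{
  double the_variance = 0.;

  // Running mean of the squared deviations from known_mean.
  for (std::size_t i = 0; i < data.size(); ++i)
    {
      const double delta = data[i] - known_mean;
      the_variance += (delta * delta - the_variance) /
        static_cast<double>(i + 1);
    }

  // Rescale by n / (n - 1) to obtain the sample variance.
  if (data.size() > 1)
    the_variance *= static_cast<double>(data.size()) /
      static_cast<double>(data.size() - 1);

  return the_variance;
}

// Standard deviation as the square root of the variance.
inline double stddev_known_mean(const std::vector<double> & data,
                                double known_mean)
{
  return std::sqrt(variance_known_mean(data, known_mean));
}
```

Computing the mean once and passing it to both functions avoids the extra pass over the data that the parameterless overloads make.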

The documentation for this class was generated from the following files: