Package it.unimi.dsi.law.big.stat
Class CorrelationIndex
java.lang.Object
it.unimi.dsi.law.big.stat.CorrelationIndex
- Direct Known Subclasses:
KendallTau, WeightedTau
public abstract class CorrelationIndex extends Object
An abstract class providing basic infrastructure for all classes computing some correlation index between two score big vectors, such as KendallTau.
Implementing classes just have to implement compute(double[][], double[][]) to get a wealth of support methods, including loading data in different formats and parsing file types.
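As a rough illustration of the contract described above, the following self-contained sketch implements a single compute(double[][], double[][]) method on top of a stand-in abstract class. The names CorrelationIndexSketch and NaiveTau are hypothetical, the quadratic tau-a computation is not the algorithm used by KendallTau, and the sketch assumes that every segment of a big vector has the same length except possibly the last one.

```java
// Hypothetical stand-in for the abstract class documented here, so the sketch compiles on its own.
abstract class CorrelationIndexSketch {
    /** Computes the correlation between two score big vectors (arrays of segments). */
    public abstract double compute(double[][] v0, double[][] v1);
}

/** Naive quadratic Kendall's tau-a; tied pairs count as neither concordant nor discordant. */
final class NaiveTau extends CorrelationIndexSketch {
    // Total number of elements across all segments.
    private static long length(double[][] v) {
        long n = 0;
        for (double[] segment : v) n += segment.length;
        return n;
    }

    // Reads element i, assuming all segments but the last share the length of the first one.
    private static double get(double[][] v, long i) {
        int segmentLength = v[0].length;
        return v[(int) (i / segmentLength)][(int) (i % segmentLength)];
    }

    @Override
    public double compute(double[][] v0, double[][] v1) {
        long n = length(v0);
        if (n != length(v1)) throw new IllegalArgumentException("Length mismatch");
        if (n < 2) throw new IllegalArgumentException("At least two elements are required");
        long concordant = 0, discordant = 0;
        for (long i = 0; i < n; i++) {
            for (long j = i + 1; j < n; j++) {
                // The product of the signs is positive when both vectors order the pair the same way.
                double s = Math.signum(get(v0, i) - get(v0, j)) * Math.signum(get(v1, i) - get(v1, j));
                if (s > 0) concordant++;
                else if (s < 0) discordant++;
            }
        }
        return (concordant - discordant) / (n * (n - 1) / 2.0);
    }
}
```

In this sketch, comparing a vector with itself yields 1 and comparing it with its reversal yields -1, provided it contains no ties.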
-
Constructor Summary
Constructors
protected CorrelationIndex()
-
Method Summary
Modifier and Type, Method, Description
abstract double compute(double[][] v0, double[][] v1)
Computes the correlation between two score big vectors.
double compute(CharSequence f0, CharSequence f1, Class<?> inputType)
Computes the correlation between two score big vectors.
double compute(CharSequence f0, Class<?> inputType0, CharSequence f1, Class<?> inputType1)
Computes the correlation between two score big vectors.
double compute(CharSequence f0, Class<?> inputType0, CharSequence f1, Class<?> inputType1, boolean reverse)
Computes the correlation between two (possibly reversed) score big vectors.
double computeDoubles(CharSequence f0, CharSequence f1)
Computes the correlation between two score big vectors.
double computeDoubles(CharSequence f0, CharSequence f1, boolean reverse)
Computes the correlation between two (possibly reversed) score big vectors.
double computeFloats(CharSequence f0, CharSequence f1)
Computes the correlation between two score big vectors.
double computeFloats(CharSequence f0, CharSequence f1, boolean reverse)
Computes the correlation between two (possibly reversed) score big vectors.
double computeInts(CharSequence f0, CharSequence f1)
Computes the correlation between two score big vectors.
double computeInts(CharSequence f0, CharSequence f1, boolean reverse)
Computes the correlation between two (possibly reversed) score big vectors.
double computeLongs(CharSequence f0, CharSequence f1)
Computes the correlation between two score big vectors.
double computeLongs(CharSequence f0, CharSequence f1, boolean reverse)
Computes the correlation between two (possibly reversed) score big vectors.
static double[][] loadAsDoubles(CharSequence f, Class<?> inputType, boolean reverse)
Loads a big vector of doubles, either in binary or textual form.
-
Constructor Details
-
CorrelationIndex
protected CorrelationIndex()
-
-
Method Details
-
compute
public abstract double compute(double[][] v0, double[][] v1)
Computes the correlation between two score big vectors.
Note that this method must be called with some care if you're tight on memory. More precisely, the two arguments should be built on the fly in the method call, and not stored in variables, as some of the argument arrays might be set to null during the execution of this method to free some memory: if an array is referenced elsewhere, the garbage collector will not be able to collect it.
- Parameters:
v0 - the first score big vector.
v1 - the second score big vector; in asymmetric correlation indices, this should be the reference score.
- Returns:
- the correlation.
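The memory note above can be pictured with a small self-contained sketch. The class OnTheFlyArguments and its methods load and range are hypothetical stand-ins, not part of this API: range drops its own reference to the argument once it has extracted what it needs, which only helps if the caller did not keep the array in a variable.

```java
final class OnTheFlyArguments {
    // Hypothetical loader standing in for a method that builds a big vector.
    static double[][] load(int segments, int segmentLength) {
        double[][] v = new double[segments][segmentLength];
        for (int s = 0; s < segments; s++)
            for (int i = 0; i < segmentLength; i++) v[s][i] = (double) s * segmentLength + i;
        return v;
    }

    // Hypothetical consumer that releases its argument before a memory-hungry phase.
    static double range(double[][] v) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double[] segment : v)
            for (double x : segment) {
                min = Math.min(min, x);
                max = Math.max(max, x);
            }
        v = null; // This method no longer keeps the array reachable.
        // ... a memory-hungry phase would run here ...
        return max - min;
    }

    public static void main(String[] args) {
        // Preferred: the array is built in the call, so nothing else references it.
        System.out.println(range(load(4, 1000)));

        // Discouraged when memory is tight: the variable may keep the array alive during range().
        double[][] kept = load(4, 1000);
        System.out.println(range(kept));
        System.out.println(kept.length);
    }
}
```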
-
computeDoubles
Computes the correlation between two score big vectors.
- Parameters:
f0 - the binary file of doubles containing the first score big vector.
f1 - the binary file of doubles containing the second score big vector.
- Returns:
- the correlation.
- Throws:
IOException
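The documentation above does not spell out the layout of a "binary file of doubles". The sketch below assumes it is a plain sequence of values as written by java.io.DataOutput.writeDouble (8 bytes each, big-endian); that layout, the class name BinaryScores and its methods are assumptions made for illustration only.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class BinaryScores {
    // Writes the scores one after the other in DataOutput format (assumed layout).
    static void writeDoubles(Path file, double[] scores) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(Files.newOutputStream(file)))) {
            for (double s : scores) out.writeDouble(s);
        }
    }

    // Reads the file back; the number of scores is derived from the file size.
    static double[] readDoubles(Path file) throws IOException {
        int n = (int) (Files.size(file) / Double.BYTES);
        double[] scores = new double[n];
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(Files.newInputStream(file)))) {
            for (int i = 0; i < n; i++) scores[i] = in.readDouble();
        }
        return scores;
    }

    // Writes the scores to a temporary file and reads them back.
    static double[] roundTrip(double[] scores) {
        try {
            Path file = Files.createTempFile("scores", ".bin");
            try {
                writeDoubles(file, scores);
                return readDoubles(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A file produced this way would be a candidate input for computeDoubles only if the assumed layout matches the one actually expected.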
-
computeDoubles
Computes the correlation between two (possibly reversed) score big vectors.
- Parameters:
f0 - the binary file of doubles containing the first score big vector.
f1 - the binary file of doubles containing the second score big vector.
reverse - whether to reverse the ranking induced by the score big vectors by loading opposite values.
- Returns:
- the correlation.
- Throws:
IOException
-
computeFloats
Computes the correlation between two score big vectors.
- Parameters:
f0 - the binary file of floats containing the first score big vector.
f1 - the binary file of floats containing the second score big vector.
- Returns:
- the correlation.
- Throws:
IOException
-
computeFloats
Computes the correlation between two (possibly reversed) score big vectors.
- Parameters:
f0 - the binary file of floats containing the first score big vector.
f1 - the binary file of floats containing the second score big vector.
reverse - whether to reverse the ranking induced by the score big vectors by loading opposite values.
- Returns:
- the correlation.
- Throws:
IOException
-
computeInts
Computes the correlation between two score big vectors.
- Parameters:
f0 - the binary file of integers containing the first score big vector.
f1 - the binary file of integers containing the second score big vector.
- Returns:
- the correlation.
- Throws:
IOException
-
computeInts
Computes the correlation between two (possibly reversed) score big vectors.
- Parameters:
f0 - the binary file of integers containing the first score big vector.
f1 - the binary file of integers containing the second score big vector.
reverse - whether to reverse the ranking induced by the score big vectors by loading opposite values.
- Returns:
- the correlation.
- Throws:
IOException
-
computeLongs
Computes the correlation between two score big vectors.
- Parameters:
f0 - the binary file of longs containing the first score big vector.
f1 - the binary file of longs containing the second score big vector.
- Returns:
- the correlation.
- Throws:
IOException
-
computeLongs
Computes the correlation between two (possibly reversed) score big vectors.
- Parameters:
f0 - the binary file of longs containing the first score big vector.
f1 - the binary file of longs containing the second score big vector.
reverse - whether to reverse the ranking induced by the score big vectors by loading opposite values.
- Returns:
- the correlation.
- Throws:
IOException
-
compute
Computes the correlation between two score big vectors.
- Parameters:
f0 - the file containing the first score big vector.
f1 - the file containing the second score big vector.
inputType - the input type.
- Returns:
- the correlation.
- Throws:
IOException
-
compute
public double compute(CharSequence f0, Class<?> inputType0, CharSequence f1, Class<?> inputType1) throws IOException
Computes the correlation between two score big vectors.
- Parameters:
f0 - the file containing the first score big vector.
inputType0 - the input type of the first score big vector.
f1 - the file containing the second score big vector.
inputType1 - the input type of the second score big vector.
- Returns:
- the correlation.
- Throws:
IOException
-
compute
public double compute(CharSequence f0, Class<?> inputType0, CharSequence f1, Class<?> inputType1, boolean reverse) throws IOException
Computes the correlation between two (possibly reversed) score big vectors.
- Parameters:
f0 - the file containing the first score big vector.
inputType0 - the input type of the first score big vector.
f1 - the file containing the second score big vector.
inputType1 - the input type of the second score big vector.
reverse - whether to reverse the ranking induced by the score big vectors by loading opposite values. They are assumed to be in binary format.
- Returns:
- the correlation.
- Throws:
IOException
-
loadAsDoubles
public static double[][] loadAsDoubles(CharSequence f, Class<?> inputType, boolean reverse) throws IOException
Loads a big vector of doubles, either in binary or textual form.
- Parameters:
f - a filename.
inputType - the input type, expressed as a class: Double, Float, Integer, Long or String to denote a text file.
reverse - whether to reverse the ranking induced by the score big vector by loading opposite values.
- Returns:
- an array of doubles obtained by reading f.
- Throws:
IllegalArgumentException - if reverse is true, the type is integer or long, and Integer.MIN_VALUE or Long.MIN_VALUE, respectively, appears in the file, as we cannot take the opposite.
IOException
-
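To make the reverse flag and the IllegalArgumentException condition concrete, here is a minimal sketch of reading binary integers as doubles while optionally taking opposites. It is not the library's implementation: the class ReverseLoading and its methods are hypothetical, the data comes from a byte array in DataInput format rather than from a file, and the result is a flat double[] rather than a big array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ReverseLoading {
    // Encodes integers in DataOutput format (4 bytes each, big-endian); helper for the example only.
    static byte[] encode(int... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int x : values) out.writeInt(x);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the integers as doubles; with reverse set, each value is replaced by its opposite.
    static double[] intsAsDoubles(byte[] data, boolean reverse) {
        int n = data.length / Integer.BYTES;
        double[] result = new double[n];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < n; i++) {
                int x = in.readInt();
                // -Integer.MIN_VALUE overflows back to Integer.MIN_VALUE, so it has no opposite as an int.
                if (reverse && x == Integer.MIN_VALUE)
                    throw new IllegalArgumentException("Cannot take the opposite of Integer.MIN_VALUE");
                result[i] = reverse ? -x : x;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Negating every score flips the induced ranking, which is why a single flag applied at load time is enough to reverse a vector.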