Variables
Multi-dimensional stochastic variables for actuarial modeling.
This module provides the ProteusVariable class, which represents multi-dimensional stochastic variables commonly used in actuarial and risk modeling. A ProteusVariable can contain different types of stochastic objects across multiple dimensions, enabling complex risk factor modeling.
Key features:
- Multi-dimensional stochastic variables with named dimensions
- Support for various stochastic types (StochasticScalar, FreqSevSims, etc.)
- Mathematical operations across dimensions and simulations
- Correlation analysis and upsampling capabilities
- Export functionality for analysis and reporting
NOTE: The serialization/deserialization methods (from_csv, from_dict, from_series) are currently incomplete and have significant limitations. A comprehensive codec system is planned to address these issues. See: https://github.com/ProteusLLP/proteusllp-actuarial-library/issues/22
The ProteusVariable is designed for actuarial applications such as:
- Multi-factor risk modeling (e.g., frequency, severity, inflation)
- Portfolio-level aggregation across risk dimensions
- Scenario analysis with correlated risk factors
- Capital modeling with interdependent variables
Example
>>> from pal.stochastic_scalar import StochasticScalar
>>> from pal.frequency_severity import FreqSevSims
>>> from pal.variables import ProteusVariable
>>>
>>> # Create a multi-dimensional risk variable
>>> risk_var = ProteusVariable(
...     dim_name="insurance_risk",
...     values={
...         "frequency": StochasticScalar([10, 12, 8, 15]),
...         "severity": StochasticScalar([5000, 6000, 4500, 7000]),
...         "expense_ratio": StochasticScalar([0.3, 0.32, 0.28, 0.35])
...     }
... )
>>> # Total cost per simulation: claims cost loaded with expenses
>>> total_cost = (
...     risk_var["frequency"]
...     * risk_var["severity"]
...     * (1 + risk_var["expense_ratio"])
... )
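The arithmetic in the example above is applied element-wise, one value per simulation. The following is a minimal sketch of the same calculation in plain Python, with lists standing in for StochasticScalar (an assumption made so the snippet runs without the library installed); it is not how the library computes the result.

```python
# Plain-Python stand-in for the element-wise arithmetic in the example above.
# Lists replace StochasticScalar here; index i corresponds to simulation i.
frequency = [10, 12, 8, 15]
severity = [5000, 6000, 4500, 7000]
expense_ratio = [0.3, 0.32, 0.28, 0.35]

# total_cost[i] = frequency[i] * severity[i] * (1 + expense_ratio[i])
total_cost = [
    f * s * (1 + e)
    for f, s, e in zip(frequency, severity, expense_ratio)
]

for sim, cost in enumerate(total_cost):
    print(f"simulation {sim}: {cost:,.2f}")
```

Each simulation keeps its own frequency, severity, and expense ratio, so the result has the same number of simulations as the inputs.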
- class pal.variables.ProteusVariable(dim_name, values)[source]
Bases: Generic[T]
A generic, homogeneous container for multivariate variables in simulations.
ProteusVariable is a hierarchical structure that holds multiple variables of the SAME type (homogeneous container). Each instance must contain either all scalars, all vectors (like StochasticScalar), or all nested ProteusVariables - but never a mix of different types.
- Type Parameter:
T: The type of values stored. By convention, T should be a ScalarOrVector type (NumericLike | VectorLike), though the parameter is unconstrained to allow flexible type inference. Usage with non-ScalarOrVector types may not be fully supported by all operations.
Key Features:
- Homogeneous: All values in a single instance must be the same type. Like List[T], you cannot mix types within one container.
- Type Safety: Operations like mean() return type T, preserving type information through the computation.
- Nesting: A ProteusVariable containing ProteusVariable instances enables hierarchical data structures (e.g., risks by region by peril).
- Dictionary Access: Sub-elements are accessed via [] notation with string keys or integer indices.
Examples
>>> # Homogeneous scalar container
>>> scalar_risks = ProteusVariable(
...     dim_name="risk_amounts",
...     values={"fire": 100000, "flood": 200000}  # All int
... )
>>> # Homogeneous vector container
>>> vector_risks = ProteusVariable(
...     dim_name="stochastic_losses",
...     values={
...         "fire": StochasticScalar([100, 200, 300]),
...         "flood": StochasticScalar([150, 250, 350])
...     }  # All StochasticScalar
... )
>>> # Homogeneous nested container
>>> nested_risks = ProteusVariable(
...     dim_name="regions",
...     values={
...         "north": scalar_risks,
...         "south": scalar_risks
...     }  # All ProteusVariable instances
... )
>>> # INVALID - mixing types not allowed
>>> # mixed = ProteusVariable(values={"a": 100, "b": StochasticScalar([1])})
>>> # This would violate homogeneity and cause type errors
Note: Statistical operations should be performed using numpy and scipy functions directly on ProteusVariable instances. For example:
- Use np.percentile(variable, p)
- Use np.mean(variable)
- Use pal.stats.tvar(variable, p)
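The homogeneity rule described above can be pictured as a check over the types of the stored values. The source does not say whether the library enforces the rule at runtime or only through type hints, so the sketch below is an illustration only; `check_homogeneous` is a hypothetical helper and plain dicts stand in for ProteusVariable.

```python
def check_homogeneous(values):
    """Return the shared type of all values, or raise TypeError if they differ.

    Hypothetical helper for illustration; not part of the pal API.
    """
    types = {type(v) for v in values.values()}
    if len(types) > 1:
        names = sorted(t.__name__ for t in types)
        raise TypeError(f"mixed value types: {names}")
    return types.pop() if types else None


scalar_risks = {"fire": 100000, "flood": 200000}  # all int
mixed_risks = {"a": 100, "b": [1.0]}              # int and list

print(check_homogeneous(scalar_risks).__name__)
try:
    check_homogeneous(mixed_risks)
except TypeError as exc:
    print(f"rejected: {exc}")
```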
- count(value)[source]
Count occurrences of value in the container.
Required for Sequence protocol compatibility.
- Return type:
- index(value, start=0, stop=None)[source]
Return index of first occurrence of value.
Required for Sequence protocol compatibility.
- Raises:
ValueError – If value is not found.
- Return type:
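For context on the Sequence protocol mentioned for count() and index(): in the Python standard library, a class that subclasses collections.abc.Sequence and defines __getitem__ and __len__ inherits working count() and index() methods. The sketch below shows this with a hypothetical `LabelledValues` class; it is not the ProteusVariable implementation, which may define these methods differently.

```python
from collections.abc import Sequence


class LabelledValues(Sequence):
    """Hypothetical labelled container; not the ProteusVariable implementation."""

    def __init__(self, values):
        self._labels = list(values)
        self._values = list(values.values())

    def __getitem__(self, key):
        # String keys look up by label, integer keys by position.
        if isinstance(key, str):
            return self._values[self._labels.index(key)]
        return self._values[key]

    def __len__(self):
        return len(self._values)


risks = LabelledValues({"fire": 100, "flood": 200, "storm": 100})
n_hundreds = risks.count(100)   # count() is inherited from Sequence
first_flood = risks.index(200)  # index() is inherited from Sequence
print(n_hundreds, first_flood)
```

As with the documented index() method, the inherited index() raises ValueError when the value is not present.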
- get_value_at_sim(sim_no)[source]
Get values at specific simulation number(s).
- Parameters:
sim_no (int | StochasticScalar) – Simulation index(es) to extract. Can be a single numeric value, a list of integers, or a VectorLike object such as StochasticScalar.
- Return type:
ProteusVariable[Union[TypeVar(T), StochasticScalar]]
- Returns:
A new ProteusVariable with values at the specified simulation indices.
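As a rough picture of what extracting values at given simulation indices means, the sketch below applies the same index (or list of indices) to every labelled set of simulations. `values_at_sim` is a hypothetical function and plain dicts of lists stand in for a ProteusVariable of StochasticScalar values; the library's actual behaviour may differ in detail.

```python
def values_at_sim(values, sim_no):
    """Pick the entries at sim_no from every labelled list of simulations.

    Hypothetical stand-in: `values` maps labels to equal-length lists and
    `sim_no` is a single index or a list of indices.
    """
    if isinstance(sim_no, int):
        return {label: sims[sim_no] for label, sims in values.items()}
    return {label: [sims[i] for i in sim_no] for label, sims in values.items()}


losses = {"fire": [100, 200, 300], "flood": [150, 250, 350]}
single = values_at_sim(losses, 1)        # one value per label
several = values_at_sim(losses, [0, 2])  # a shorter list per label
print(single)
print(several)
```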
- upsample(n_sims)[source]
Upsample the variable to the specified number of simulations.
- Return type:
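The documentation does not describe how the additional simulations are produced. One simple scheme, assumed here purely for illustration, repeats the existing simulations cyclically until n_sims values exist. `upsample_cyclic` is a hypothetical function operating on a plain list and should not be read as the library's method.

```python
def upsample_cyclic(sims, n_sims):
    """Repeat the existing simulations in order until n_sims values exist.

    Assumed scheme for illustration only; not taken from the pal source.
    """
    if not sims:
        raise ValueError("cannot upsample an empty list of simulations")
    return [sims[i % len(sims)] for i in range(n_sims)]


upsampled = upsample_cyclic([100, 200, 300], 7)
print(upsampled)
```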
- validate_freqsev_consistency(_is_nested=False)[source]
Validate that all FreqSevSims have consistent sim_index.
When a ProteusVariable contains multiple FreqSevSims objects, operations like sum() or aggregation require that all FreqSevSims have identical simulation indices for meaningful results. This method recursively checks for that consistency across nested ProteusVariable structures.
All leaf values in the ProteusVariable tree must be FreqSevSims with matching simulation indices. Nested ProteusVariable structures are supported and will be recursively validated.
Use this validation before performing aggregation operations on ProteusVariable instances containing FreqSevSims to ensure the results will be valid.
- Parameters:
_is_nested (bool) – Internal parameter for tracking recursion depth. Do not set manually.
- Returns:
- is_valid: True if all leaf values are FreqSevSims with matching sim_index, or if there are 0 FreqSevSims (trivially consistent)
- error_message: Empty string if valid, descriptive error message otherwise
- sim_index: Representative sim_index array if valid and FreqSevSims found, None if no FreqSevSims or invalid
- Return type:
tuple[bool, str, ndarray[tuple[Any, ...], dtype[Any]] | None]
Example
>>> freq_sev_1 = FreqSevSims([0, 1, 2], [10, 20, 30], 3)
>>> freq_sev_2 = FreqSevSims([0, 1, 2], [15, 25, 35], 3)
>>> var = ProteusVariable(
...     "losses", {"fire": freq_sev_1, "flood": freq_sev_2}
... )
>>> is_valid, msg, sim_idx = var.validate_freqsev_consistency()
>>> if is_valid:
...     total = var.sum()  # Safe to sum
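The recursive check described above can be sketched on plain data. In this hypothetical model, a leaf is a dict carrying a "sim_index" list and any other dict is a nested container; `validate_consistency` mirrors the documented (is_valid, error_message, sim_index) return shape but is not the library's code.

```python
def validate_consistency(node):
    """Recursively check that every leaf has the same sim_index.

    Hypothetical model: a leaf is a dict with a "sim_index" list, and any
    other dict is a nested container mapping labels to nodes.
    Returns (is_valid, error_message, sim_index).
    """
    if "sim_index" in node:  # leaf
        return True, "", node["sim_index"]
    reference = None
    for label, child in node.items():
        ok, msg, idx = validate_consistency(child)
        if not ok:
            return False, msg, None
        if idx is None:  # empty sub-tree, nothing to compare
            continue
        if reference is None:
            reference = idx
        elif idx != reference:
            return False, f"sim_index mismatch at '{label}'", None
    return True, "", reference


fire = {"sim_index": [0, 1, 2], "values": [10, 20, 30]}
flood = {"sim_index": [0, 1, 2], "values": [15, 25, 35]}
quake = {"sim_index": [0, 0, 1], "values": [5, 6, 7]}

ok_tree = {"north": {"fire": fire, "flood": flood}, "south": {"fire": fire}}
bad_tree = {"north": {"fire": fire, "quake": quake}}

print(validate_consistency(ok_tree))
print(validate_consistency(bad_tree))
```

An empty container is reported as valid with a sim_index of None, matching the "trivially consistent" case in the documented return values.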
- classmethod from_csv(file_name, dim_name, values_column, simulation_column='Simulation')[source]
Import a ProteusVariable from a CSV file.
This method currently has significant limitations and will be replaced with a more comprehensive serialization system.
Current Limitations:
- Only supports one-dimensional variables
- Always creates StochasticScalar values regardless of intended type
- Cannot preserve generic type information through deserialization
- No support for nested ProteusVariable structures
- Parameters:
- Return type:
- Returns:
ProteusVariable with StochasticScalar values loaded from the CSV
- TODO: Implement comprehensive codec system for proper serialization
See: https://github.com/ProteusLLP/proteusllp-actuarial-library/issues/22
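The CSV layout expected by from_csv is not described here. The sketch below assumes a long format, with one row per simulation and label, a column named by dim_name holding the labels, and a column named by values_column holding the values. Under that assumption, it groups the rows into one list of values per label using only the standard library. `read_long_csv` and the sample column names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumed long-format layout: one row per (simulation, label) pair.
csv_text = """Simulation,peril,loss
0,fire,100
0,flood,150
1,fire,200
1,flood,250
"""


def read_long_csv(handle, dim_name, values_column, simulation_column="Simulation"):
    """Group a long-format CSV into {label: [value per simulation]}."""
    by_label = defaultdict(dict)
    for row in csv.DictReader(handle):
        sim = int(row[simulation_column])
        by_label[row[dim_name]][sim] = float(row[values_column])
    # Order each label's values by simulation number.
    return {
        label: [sims[k] for k in sorted(sims)]
        for label, sims in by_label.items()
    }


loaded = read_long_csv(io.StringIO(csv_text), "peril", "loss")
print(loaded)
```

Consistent with the limitations listed above, the result is one-dimensional: a flat mapping from label to a list of simulated values.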
- classmethod from_dict(data)[source]
Create a ProteusVariable from a dictionary.
This method currently has significant limitations and will be replaced with a more comprehensive serialization system.
Current Limitations:
- Only supports one-dimensional variables
- Always creates StochasticScalar values from float lists
- Cannot preserve generic type information
- No support for nested structures or other value types
- Parameters:
data (dict[str, list[float]]) – Dictionary mapping dimension labels to lists of float values
- Return type:
- Returns:
ProteusVariable with StochasticScalar values created from the data
- TODO: Implement comprehensive codec system for proper serialization
See: https://github.com/ProteusLLP/proteusllp-actuarial-library/issues/22
- classmethod from_series(data)[source]
Create a ProteusVariable from a pandas Series.
This method currently has significant limitations and will be replaced with a more comprehensive serialization system.
Current Limitations:
- Only supports one-dimensional variables
- Creates scalar values, not StochasticScalar
- Cannot preserve generic type information
- Limited to single simulation (n_sims=1)
- Parameters:
data (Series) – Pandas Series with values to load
- Return type:
- Returns:
ProteusVariable with scalar values from the Series
- TODO: Implement comprehensive codec system for proper serialization
See: https://github.com/ProteusLLP/proteusllp-actuarial-library/issues/22
- correlation_matrix(correlation_type='spearman')[source]
Compute correlation matrix between variables.
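The default correlation_type is 'spearman'. Spearman correlation is the Pearson correlation of the ranks of the values, so it measures monotone rather than linear association. The following pure-Python sketch computes a Spearman matrix for a dict of simulated values; the function names are hypothetical and the library's own computation is not shown in this documentation.

```python
def ranks(xs):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman_matrix(variables):
    """Matrix of Spearman correlations, rows and columns in label order."""
    labels = list(variables)
    ranked = {k: ranks(v) for k, v in variables.items()}
    return [[pearson(ranked[a], ranked[b]) for b in labels] for a in labels]


losses = {
    "fire": [100, 200, 300, 400],
    "flood": [10, 40, 90, 160],  # increases with fire, but not linearly
    "storm": [4, 3, 2, 1],       # decreases as fire increases
}
matrix = spearman_matrix(losses)
for label, row in zip(losses, matrix):
    print(label, [round(v, 3) for v in row])
```

Because only the ordering matters, "flood" is perfectly rank-correlated with "fire" even though the relationship between their values is quadratic.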
StochasticScalar
Stochastic scalar variables for Monte Carlo simulation.
Provides the StochasticScalar class for representing and manipulating scalar-valued stochastic variables in actuarial and risk modeling applications. Supports arithmetic operations, statistical functions, and numpy integration.
- class pal.stochastic_scalar.StochasticScalar(values)[source]
Bases: ProteusStochasticVariable
A class to represent a single scalar variable in a simulation.
- coupled_variable_group: CouplingGroup
- __init__(values)[source]
Initialize a stochastic scalar.
- Parameters:
values (TypeAliasType) – An array of values that describe the distribution for the scalar variable.
- property ranks: StochasticScalar
Return the ranks of the variable.
- all()
Check if all values in the variable are True (non-zero).
- Return type:
- Returns:
True if all values are non-zero, False otherwise.
- any()
Check if any value in the variable is True (non-zero).
- Return type:
- Returns:
True if any value is non-zero, False otherwise.
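The "True (non-zero)" semantics of all() and any() follow ordinary Python truthiness of numbers. A minimal sketch on a plain list of simulated values, used here in place of StochasticScalar so the snippet runs on its own:

```python
# Plain list standing in for the simulated values of a StochasticScalar.
sims = [0.0, 2.5, 1.0]

# all(): every simulation is non-zero; any(): at least one is non-zero.
all_nonzero = all(v != 0 for v in sims)
any_nonzero = any(v != 0 for v in sims)

print(all_nonzero, any_nonzero)
```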
- astype(dtype)
Convert the underlying values to a specified dtype.