galaxychop.models package

Module for dynamical decomposition models.

class galaxychop.models.Components(labels, ptypes, probabilities)[source]

Bases: object

Class of components resulting from dynamic decomposition.

This class creates the components of the galaxy from the result of the dynamic decomposition.

Parameters
  • labels (np.ndarray) – 1D array with the index of the component to which each particle belongs. Shape: (n,1).

  • ptypes (np.ndarray) – Indicates the type of particle: stars = 0, dark matter = 1, gas = 2. Shape: (n,1).

  • probabilities (np.ndarray or None) – 1D array with probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Shape: (n,1). Otherwise it adopts the value None.

to_dataframe(attributes=None)[source]

Convert to pandas data frame.

This method builds a data frame of all parameters of Components.

Returns

DataFrame – DataFrame of all Components data.

Return type

pandas.DataFrame
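As a rough illustration of the kind of table this can produce, the sketch below builds a pandas DataFrame from three toy arrays shaped like the Components parameters. The column names and values are assumptions made for illustration; the columns actually returned by to_dataframe may differ.

```python
import numpy as np
import pandas as pd

# Toy decomposition result for five particles (illustrative values only).
labels = np.array([0.0, 1.0, 1.0, np.nan, np.nan])  # NaN = particle not clustered
ptypes = np.array([0, 0, 0, 1, 2])  # 0 = stars, 1 = dark matter, 2 = gas
probabilities = None  # model that does not provide probabilities

# Hypothetical column names; the real method may name them differently.
data = {"labels": labels, "ptypes": ptypes}
if probabilities is not None:
    data["probabilities"] = probabilities

df = pd.DataFrame(data)
print(df)
```

One row per particle keeps the labels aligned with the particle types, so filtering by type or by component is a plain boolean selection.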

class galaxychop.models.GalaxyDecomposerABC(*, cbins=(0.05, 0.005))[source]

Bases: object

Abstract class to facilitate the creation of decomposers.

This class requires subclasses to redefine three methods: get_attributes, get_rows_mask and split.

Parameters

cbins (tuple) – It contains the two bin widths necessary for the calculation of the circular angular momentum. Shape: (2,). Default value = (0.05, 0.005).

abstract get_attributes()[source]

Attributes for the parameter space.

Returns

attributes – Particle attributes used to operate the clustering.

Return type

keys of ParticleSet class parameters

abstract get_rows_mask(X, y, attributes)[source]

Mask for the valid rows to operate clustering.

This method gets the mask for the valid rows to operate clustering.

Parameters
  • X (np.ndarray(n_particles, attributes)) – 2D array where each row is a different particle and each column is an attribute of the particles. n_particles is the total number of particles.

  • y (np.ndarray(n_particles,)) – 1D array identifying the type of each particle: 0 = stars, 1 = dark matter, 2 = gas. n_particles is the total number of particles.

  • attributes (tuple) – Dictionary keys of ParticleSet class parameters with particle attributes used to operate the clustering.

Returns

mask – Mask only with valid values to operate the clustering.

Return type

np.ndarray(m_particles)

abstract split(X, y, attributes)[source]

Compute clustering.

Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster.

  • y (Ignored) – Not used, present here for API consistency by convention.

Returns

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • probs (np.ndarray(m_particles) or None) – Probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Otherwise it adopts the value None.

attributes_matrix(galaxy, attributes)[source]

Matrix of particle attributes.

This method obtains the matrix with the particles and attributes necessary to operate the clustering.

Parameters
  • galaxy (Galaxy class object) – Instance of Galaxy class.

  • attributes (keys of ParticleSet class parameters) – Particle attributes used to operate the clustering.

Returns

  • X (np.ndarray(n_particles, attributes)) – 2D array where each row is a different particle and each column is an attribute of the particles. n_particles is the total number of particles.

  • y (np.ndarray(n_particles)) – 1D array identifying the type of each particle: 0 = stars, 1 = dark matter, 2 = gas. n_particles is the total number of particles.
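A minimal sketch of how such a matrix could be assembled, assuming the per-particle attributes are available as named 1D arrays. The dictionary `particle_attrs` and its values are hypothetical stand-ins for a Galaxy instance, not the library's data model.

```python
import numpy as np

# Hypothetical per-particle attributes, keyed by name (toy values).
particle_attrs = {
    "normalized_star_energy": np.array([-0.9, -0.5, -0.2, np.nan]),
    "eps": np.array([0.8, 0.1, -0.3, np.nan]),
}
ptypes = np.array([0, 0, 0, 1])  # 0 = stars, 1 = dark matter, 2 = gas

attributes = ("normalized_star_energy", "eps")

# One row per particle, one column per requested attribute.
X = np.column_stack([particle_attrs[name] for name in attributes])
y = ptypes

print(X.shape)  # (4, 2)
```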

complete_labels(X, labels, rows_mask)[source]

Complete the labels of all particles.

This method assigns the labels obtained from clustering to the particles used for this purpose. The rest are assigned the label NaN.

Parameters
  • X (np.ndarray(n_particles, attributes)) – 2D array where each row is a different particle and each column is a parameter of the particles. n_particles is the total number of particles.

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • rows_mask (np.ndarray(m_particles)) – Mask only with valid values to operate the clustering. m_particles is the total number of particles with valid values to operate the clustering.

Returns

new_labels – 1D array with the index of the clusters to which each particle belongs. Particles that do not belong to any of them are assigned the label NaN. n_particles is the total number of particles.

Return type

np.ndarray(n_particles)
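The completion step can be sketched as follows. This assumes rows_mask is a boolean array with one entry per particle whose True entries mark the m_particles rows that were clustered; that reading, and the function name, are assumptions rather than the library's implementation.

```python
import numpy as np

def complete_labels_sketch(X, labels, rows_mask):
    """Spread m cluster labels over n particles, NaN elsewhere (sketch)."""
    new_labels = np.full(len(X), np.nan)  # float array so it can hold NaN
    new_labels[rows_mask] = labels
    return new_labels

X = np.zeros((5, 2))
rows_mask = np.array([True, False, True, True, False])
labels = np.array([1, 0, 1])  # one label per clustered particle
new_labels = complete_labels_sketch(X, labels, rows_mask)
print(new_labels)
```

Because NaN is a floating-point value, the completed labels come back as floats even though the cluster indices are integers.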

complete_probs(X, probs, rows_mask)[source]

Complete the probabilities of all particles.

This method assigns the probabilities obtained from clustering to the particles used for this purpose. The rest are assigned NaN. It returns None if the clustering method returns None probabilities.

Parameters
  • X (np.ndarray(n_particles, attributes)) – 2D array where each row is a different particle and each column is a parameter of the particles. n_particles is the total number of particles.

  • probs (np.ndarray(n_cluster, m_particles)) – 2D array with probabilities of belonging to each component. n_cluster is the number of components obtained. m_particles is the total number of particles with valid values to operate the clustering.

  • rows_mask (np.ndarray(m_particles)) – Mask only with valid values to operate the clustering. m_particles is the total number of particles with valid values to operate the clustering.

Returns

new_probs – 2D array with probabilities of belonging to each component. n_cluster is the number of components obtained. n_particles is the total number of particles. Particles that do not belong to any component are assigned NaN. This method returns None if the clustering method returns None probabilities.

Return type

np.ndarray(n_cluster, n_particles)
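A sketch of the two-dimensional case, including the None pass-through. It follows the (n_cluster, n_particles) orientation documented here and again assumes a boolean rows_mask with one entry per particle; the function name is hypothetical.

```python
import numpy as np

def complete_probs_sketch(X, probs, rows_mask):
    """Spread per-cluster probabilities over all particles (sketch)."""
    if probs is None:
        return None  # the model does not provide probabilities
    n_cluster = probs.shape[0]
    new_probs = np.full((n_cluster, len(X)), np.nan)
    new_probs[:, rows_mask] = probs  # fill only the clustered columns
    return new_probs

X = np.zeros((4, 3))
rows_mask = np.array([True, False, True, False])
probs = np.array([[0.9, 0.2], [0.1, 0.8]])  # shape (n_cluster=2, m_particles=2)
new_probs = complete_probs_sketch(X, probs, rows_mask)
print(new_probs.shape)  # (2, 4)
```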

decompose(galaxy)[source]

Decompose method.

Validates the input galaxy instance and assigns each particle to the galaxy component to which it belongs.

Parameters

galaxy (Galaxy class object) – Instance of Galaxy class.

Returns

Instance of the Components class, with the result of the dynamic decomposition.

Return type

Components
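To show how the methods documented for this class might fit together, here is a toy decomposer written with NumPy only. The order of the steps, the fixed 0.6 threshold, and the fact that it takes X and y directly instead of a Galaxy instance are all assumptions made for illustration; this is not the library's decompose implementation.

```python
import numpy as np

class ToyThresholdDecomposer:
    """Toy decomposer; the step order is an assumption, not library code."""

    def get_attributes(self):
        return ("eps",)

    def get_rows_mask(self, X, y, attributes):
        # Keep stellar particles (y == 0) whose attributes are all finite.
        return (y == 0) & np.all(np.isfinite(X), axis=1)

    def split(self, X, y, attributes):
        # Label 1 where eps > 0.6, else 0; this toy model has no probabilities.
        return (X[:, 0] > 0.6).astype(int), None

    def decompose(self, X, y):
        attributes = self.get_attributes()
        rows_mask = self.get_rows_mask(X, y, attributes)
        labels, probs = self.split(X[rows_mask], y[rows_mask], attributes)
        # Particles left out of the clustering get NaN.
        full_labels = np.full(len(X), np.nan)
        full_labels[rows_mask] = labels
        return full_labels, probs

X = np.array([[0.9], [0.2], [np.nan], [0.7]])
y = np.array([0, 0, 0, 1])  # the last particle is dark matter
labels, probs = ToyThresholdDecomposer().decompose(X, y)
print(labels)
```

Keeping the mask, the clustering and the completion as separate methods means a new model only has to supply its own parameter space and split logic.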

class galaxychop.models.DynamicStarsDecomposerMixin[source]

Bases: object

Dynamic Stars Decomposer Mixin Class.

This class redefines the get_rows_mask method so that dynamic decomposition is performed using only stellar particles.

get_rows_mask(X, y, attributes)[source]

Note

Only stellar particles are used to carry out the dynamic decomposition. In addition, the parameters of the parameter space, where the dynamic decomposition is carried out, must have finite values.

Parameters
  • X (np.ndarray(n_particles, attributes)) – 2D array where each row is a different particle and each column is an attribute of the particles. n_particles is the total number of particles.

  • y (np.ndarray(n_particles,)) – 1D array identifying the type of each particle: 0 = stars, 1 = dark matter, 2 = gas. n_particles is the total number of particles.

  • attributes (tuple) – Dictionary keys of ParticleSet class parameters with particle attributes used to operate the clustering.

Returns

mask – Mask only with valid values to operate the clustering.

Return type

np.ndarray(m_particles)
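The two conditions in the note above (stellar particles only, finite values in every attribute) can be sketched as a boolean mask. The function name is hypothetical, and returning a mask with one entry per particle is an assumption about the return shape.

```python
import numpy as np

def stellar_finite_mask(X, y):
    """Select stellar rows whose attributes are all finite (sketch)."""
    is_star = y == 0  # 0 = stars
    is_finite = np.all(np.isfinite(X), axis=1)  # rejects NaN and +/-inf
    return is_star & is_finite

X = np.array([[0.9, -0.5], [np.nan, -0.2], [0.1, -0.7], [0.3, np.inf]])
y = np.array([0, 0, 1, 0])
mask = stellar_finite_mask(X, y)
print(mask.tolist())  # [True, False, False, False]
```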

class galaxychop.models.JThreshold(*, cbins=(0.05, 0.005), eps_cut=0.6)[source]

Bases: galaxychop.models._base.DynamicStarsDecomposerMixin, galaxychop.models._base.GalaxyDecomposerABC

JThreshold class.

Implementation of galaxy dynamical decomposition model using only the circularity parameter. Tissera et al. (2012) [2], Marinacci et al. (2014) [3], Vogelsberger et al. (2014) [4], Park et al. (2019) [5].

Parameters

eps_cut (float, default=0.6) – Cut-off value in the circularity parameter. Stellar particles with eps > eps_cut are assigned to the disk and stellar particles with eps <= eps_cut to the spheroid.

Notes

Index of the cluster each stellar particle belongs to:

Index=0: corresponds to the galaxy spheroid. Index=1: corresponds to the galaxy disk.
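The cut described for eps_cut amounts to a single comparison. A minimal sketch with made-up circularity values:

```python
import numpy as np

eps_cut = 0.6
eps = np.array([0.95, 0.6, 0.1, -0.4, 0.61])  # toy circularity values

# 1 = disk (eps > eps_cut), 0 = spheroid (eps <= eps_cut).
labels = (eps > eps_cut).astype(int)
print(labels.tolist())  # [1, 0, 0, 0, 1]
```

Note that a particle with eps exactly equal to eps_cut falls on the spheroid side, since the disk condition is a strict inequality.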

Examples

Example of implementation.

>>> import galaxychop as gchop
>>> galaxy = gchop.read_hdf5(...)
>>> galaxy = gchop.star_align(gchop.center(galaxy))
>>> chopper = gchop.JThreshold()
>>> chopper.decompose(galaxy)

References

[2] Tissera, P. B., White, S. D. M., and Scannapieco, C., “Chemical signatures of formation processes in the stellar populations of simulated galaxies”, Monthly Notices of the Royal Astronomical Society, vol. 420, no. 1, pp. 255-270, 2012. doi:10.1111/j.1365-2966.2011.20028.x. https://ui.adsabs.harvard.edu/abs/2012MNRAS.420..255T/abstract

[3] Marinacci, F., Pakmor, R., and Springel, V., “The formation of disc galaxies in high-resolution moving-mesh cosmological simulations”, Monthly Notices of the Royal Astronomical Society, vol. 437, no. 2, pp. 1750-1775, 2014. doi:10.1093/mnras/stt2003. https://ui.adsabs.harvard.edu/abs/2014MNRAS.437.1750M/abstract

[4] Vogelsberger, M., “Introducing the Illustris Project: simulating the coevolution of dark and visible matter in the Universe”, Monthly Notices of the Royal Astronomical Society, vol. 444, no. 2, pp. 1518-1547, 2014. doi:10.1093/mnras/stu1536. https://ui.adsabs.harvard.edu/abs/2014MNRAS.444.1518V/abstract

[5] Park, M.-J., “New Horizon: On the Origin of the Stellar Disk and Spheroid of Field Galaxies at z = 0.7”, The Astrophysical Journal, vol. 883, no. 1, 2019. doi:10.3847/1538-4357/ab3afe. https://ui.adsabs.harvard.edu/abs/2019ApJ...883...25P/abstract

check_eps_cut(attribute, value)[source]

Eps_cut value validator.

This method validates that the value of eps_cut is in the interval (-1,1).
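A sketch of such a range check as a plain function. The exception type and message are assumptions; the library's validator receives the attribute object as well and may report the error differently.

```python
def check_eps_cut_sketch(value):
    """Raise if value is outside the open interval (-1, 1) (sketch)."""
    if not -1 < value < 1:
        raise ValueError(f"eps_cut must be in the interval (-1, 1), got {value}")
    return value

check_eps_cut_sketch(0.6)  # accepted
try:
    check_eps_cut_sketch(1.0)  # rejected: the interval is open
except ValueError as err:
    print(err)
```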

get_attributes()[source]

Attributes for the parameter space.

Returns

attributes – Particle attributes used to operate the clustering.

Return type

keys of ParticleSet class parameters

Notes

In this model the parameter space is given by

eps: circularity parameter (J_z/J_circ).

split(X, y, attributes)[source]

Compute clustering.

Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster.

  • y (Ignored) – Not used, present here for API consistency by convention.

Returns

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • probs (np.ndarray(m_particles) or None) – Probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Otherwise it adopts the value None.

Notes

The attributes used by the model are described in detail in the class documentation.

class galaxychop.models.JHistogram(*, cbins=(0.05, 0.005), n_bin=100, digits=2, random_state=None)[source]

Bases: galaxychop.models._base.DynamicStarsDecomposerMixin, galaxychop.models._base.GalaxyDecomposerABC

JHistogram class.

Implementation of galaxy dynamical decomposition model described in Abadi et al. (2003) [1].

Parameters
  • n_bin (int, default=100) – Number of bins needed to build the circularity parameter histogram.

  • digits (int, default=2) – Number of decimals to which an array is rounded.

  • random_state (int, default=None) – Seed to initialize the random generator.

Notes

Index of the cluster each stellar particle belongs to:

Index=0: corresponds to the galaxy spheroid. Index=1: corresponds to the galaxy disk.
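The idea behind this kind of histogram decomposition, as described by Abadi et al. (2003), is that the spheroid is assumed not to rotate, so its circularity distribution is symmetric about eps = 0. The sketch below mirrors the counts of the negative-eps bins onto the positive side and draws that many particles at random for the spheroid. It is a simplified illustration, not the library's code: the function name is hypothetical, n_bin is assumed even, eps is assumed to lie in [-1, 1], and the digits parameter is not modelled.

```python
import numpy as np

def mirror_histogram_split(eps, n_bin=100, random_state=None):
    """Label particles 0 (spheroid) or 1 (disk) by mirroring counts (sketch)."""
    rng = np.random.default_rng(random_state)
    edges = np.linspace(-1.0, 1.0, n_bin + 1)  # bins symmetric around eps = 0
    # Bin index of each particle, clipped so eps = 1 falls in the last bin.
    idx = np.clip(np.digitize(eps, edges) - 1, 0, n_bin - 1)

    labels = np.ones(len(eps), dtype=int)  # start with everything as disk
    labels[eps < 0] = 0  # counter-rotating particles go to the spheroid
    half = n_bin // 2
    for k in range(half):
        neg_bin, pos_bin = half - 1 - k, half + k  # mirrored pair of bins
        n_neg = np.count_nonzero(idx == neg_bin)
        candidates = np.flatnonzero(idx == pos_bin)
        n_pick = min(n_neg, len(candidates))
        if n_pick == 0:
            continue
        # Randomly assign as many co-rotating particles to the spheroid.
        chosen = rng.choice(candidates, size=n_pick, replace=False)
        labels[chosen] = 0
    return labels

rng = np.random.default_rng(0)
# Toy galaxy: a symmetric population plus a fast-rotating one.
eps = np.concatenate([rng.uniform(-1, 1, 400), rng.uniform(0.7, 1.0, 600)])
labels = mirror_histogram_split(eps, n_bin=20, random_state=42)
print(np.bincount(labels))
```

The random draw is why a seed matters here: two runs with different seeds label the same number of particles per bin as spheroid, but not necessarily the same particles.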

Examples

Example of implementation of Abadi Model.

>>> import galaxychop as gchop
>>> galaxy = gchop.read_hdf5(...)
>>> galaxy = gchop.star_align(gchop.center(galaxy))
>>> chopper = gchop.JHistogram()
>>> chopper.decompose(galaxy)

References

[1] Abadi, M. G., Navarro, J. F., Steinmetz, M., and Eke, V. R., “Simulations of Galaxy Formation in a Λ Cold Dark Matter Universe. II. The Fine Structure of Simulated Galactic Disks”, The Astrophysical Journal, vol. 597, no. 1, pp. 21–34, 2003. doi:10.1086/378316. https://ui.adsabs.harvard.edu/abs/2003ApJ...597...21A/abstract

get_attributes()[source]

Attributes for the parameter space.

Returns

attributes – Particle attributes used to operate the clustering.

Return type

keys of ParticleSet class parameters

Notes

In this model the parameter space is given by

eps: circularity parameter (J_z/J_circ).

split(X, y, attributes)[source]

Compute clustering.

Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster.

  • y (Ignored) – Not used, present here for API consistency by convention.

Returns

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • probs (np.ndarray(m_particles) or None) – Probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Otherwise it adopts the value None.

Notes

The attributes used by the Abadi model are described in detail in the class documentation.

class galaxychop.models.JEHistogram(*, cbins=(0.05, 0.005), n_bin=100, digits=2, random_state=None, n_bin_E=20)[source]

Bases: galaxychop.models._histogram.JHistogram

JEHistogram class.

Implementation of a modification of Abadi galaxy dynamical decomposition model using the circularity parameter and specific energy distribution.

Parameters
  • n_bin_E (int, default=20) – Number of bins needed to build the normalised specific energy histogram.

  • **kwargs (key, value mappings) – Other optional keyword arguments are passed through to the JHistogram class.

Notes

Index of the cluster each stellar particle belongs to:

Index=0: corresponds to the galaxy spheroid. Index=1: corresponds to the galaxy disk.

Examples

Example of the implementation of the modified Abadi model.

>>> import galaxychop as gchop
>>> galaxy = gchop.read_hdf5(...)
>>> galaxy = gchop.star_align(gchop.center(galaxy))
>>> chopper = gchop.JEHistogram()
>>> chopper.decompose(galaxy)

get_attributes()[source]

Attributes for the parameter space.

Returns

attributes – Particle attributes used to operate the clustering.

Return type

keys of ParticleSet class parameters

Notes

In this model the parameter space is given by

normalized_star_energy: normalized specific energy of the stars.

eps: circularity parameter (J_z/J_circ).

split(X, y, attributes)[source]

Compute clustering.

Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster.

  • y (Ignored) – Not used, present here for API consistency by convention.

Returns

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • probs (np.ndarray(m_particles) or None) – Probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Otherwise it adopts the value None.

Notes

The attributes used by the modified Abadi model are described in detail in the class documentation.

class galaxychop.models.KMeans(*, cbins=(0.05, 0.005), n_components=2, init='k-means++', n_init=10, max_iter=300, tol=0.0001, verbose=0, random_state=None, algorithm='auto')[source]

Bases: galaxychop.models._base.DynamicStarsDecomposerMixin, galaxychop.models._base.GalaxyDecomposerABC

KMeans class.

Implementation of scikit-learn [6] K-means as a method for dynamically decomposing galaxies.

Parameters
  • n_components (int, default=2) – The number of clusters to form as well as the number of centroids to generate.

  • init ({‘k-means++’, ‘random’}, callable or array-like of shape (n_clusters, n_features), default="k-means++") – Parameter of the KMeans class in the scikit-learn library.

  • n_init (int, default=10) – Parameter of the KMeans class in the scikit-learn library.

  • max_iter (int, default=300) – Parameter of the KMeans class in the scikit-learn library.

  • tol (float, default=0.0001) – Parameter of the KMeans class in the scikit-learn library.

  • verbose (int, default=0) – Parameter of the KMeans class in the scikit-learn library.

  • random_state (int, default=None) – Parameter of the KMeans class in the scikit-learn library.

  • algorithm ({“auto”, “full”, “elkan”}, default="auto") – Parameter of the KMeans class in the scikit-learn library.

Notes

More information for KMeans class:

https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
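For orientation, this is roughly what the underlying scikit-learn call looks like on a three-column parameter space. The data are synthetic, and the mapping of this class's n_components onto scikit-learn's n_clusters is an assumption based on the parameter description above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic groups in a 3-column space standing in for
# (normalized_star_energy, eps, eps_r); the values are made up.
spheroid_like = rng.normal([-0.8, 0.0, 0.6], 0.05, size=(50, 3))
disk_like = rng.normal([-0.4, 0.9, 0.2], 0.05, size=(50, 3))
X = np.vstack([spheroid_like, disk_like])

model = KMeans(n_clusters=2, n_init=10, max_iter=300, tol=1e-4, random_state=0)
labels = model.fit_predict(X)
probs = None  # K-means gives hard assignments only, no probabilities

print(np.bincount(labels))
```

K-means numbers its clusters arbitrarily, so which index ends up describing the disk-like group can change with the seed.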

Examples

Example of implementation of KMeans Model.

>>> import galaxychop as gchop
>>> galaxy = gchop.read_hdf5(...)
>>> galaxy = gchop.star_align(gchop.center(galaxy))
>>> chopper = gchop.KMeans()
>>> chopper.decompose(galaxy)

References

[6] Pedregosa et al., Journal of Machine Learning Research 12, pp. 2825-2830, 2011. https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html

get_attributes()[source]

Attributes for the parameter space.

Returns

attributes – Particle attributes used to operate the clustering.

Return type

keys of ParticleSet class parameters

Notes

In this model the parameter space is given by

normalized_star_energy: normalized specific energy of the stars.

eps: circularity parameter (J_z/J_circ).

eps_r: projected circularity parameter (J_p/J_circ).

split(X, y, attributes)[source]

Compute clustering.

Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster.

  • y (Ignored) – Not used, present here for API consistency by convention.

Returns

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • probs (np.ndarray(m_particles) or None) – Probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Otherwise it adopts the value None.

Notes

The attributes used by the kmeans model are described in detail in the class documentation.

class galaxychop.models.DynamicStarsGaussianDecomposerABC(*, cbins=(0.05, 0.005), covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=10, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)[source]

Bases: galaxychop.models._base.DynamicStarsDecomposerMixin, galaxychop.models._base.GalaxyDecomposerABC

Dynamic Stars Gaussian Decomposer Class.

Parameters
  • covariance_type ({‘full’, ‘tied’, ‘diag’, ‘spherical’}, default="full") – Parameter of the GaussianMixture class in the scikit-learn library.

  • tol (float, default=0.001) – Parameter of the GaussianMixture class in the scikit-learn library.

  • reg_covar (float, default=1e-06) – Parameter of the GaussianMixture class in the scikit-learn library.

  • max_iter (int, default=100) – Parameter of the GaussianMixture class in the scikit-learn library.

  • n_init (int, default=10) – Parameter of the GaussianMixture class in the scikit-learn library.

  • init_params ({‘kmeans’, ‘random’}, default="kmeans") – Parameter of the GaussianMixture class in the scikit-learn library.

  • weights_init (array-like of shape (n_components, ), default=None) – Parameter of the GaussianMixture class in the scikit-learn library.

  • means_init (array-like of shape (n_components, n_features), default=None) – Parameter of the GaussianMixture class in the scikit-learn library.

  • precisions_init (array-like, default=None) – Parameter of the GaussianMixture class in the scikit-learn library.

  • random_state (int, default=None) – Parameter of the GaussianMixture class in the scikit-learn library.

  • warm_start (bool, default=False) – Parameter of the GaussianMixture class in the scikit-learn library.

  • verbose (int, default=0) – Parameter of the GaussianMixture class in the scikit-learn library.

  • verbose_interval (int, default=10) – Parameter of the GaussianMixture class in the scikit-learn library.

get_attributes()[source]

Attributes for the parameter space.

Returns

attributes – Particle attributes used to operate the clustering.

Return type

keys of ParticleSet class parameters

Notes

In this model the parameter space is given by

normalized_star_energy: normalized specific energy of the stars.

eps: circularity parameter (J_z/J_circ).

eps_r: projected circularity parameter (J_p/J_circ).

class galaxychop.models.GaussianMixture(*, cbins=(0.05, 0.005), covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=10, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10, n_components=2)[source]

Bases: galaxychop.models._gaussian_mixture.DynamicStarsGaussianDecomposerABC

GaussianMixture class.

Implementation of the method for dynamically decomposing galaxies described by Obreja et al. (2018) [7].

Parameters
  • n_components (int, default=2) – The number of mixture components. Parameter of the GaussianMixture class in the scikit-learn library.

  • **kwargs (key, value mappings) – Other optional keyword arguments are passed through to the GaussianMixture class in the scikit-learn library.

Notes

More information for GaussianMixture class:

https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html
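A short sketch of the underlying scikit-learn estimator on synthetic three-column data, showing where hard labels and membership probabilities come from. The data are made up, and how this class post-processes the scikit-learn output is not shown.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic stand-in for (normalized_star_energy, eps, eps_r).
X = np.vstack([
    rng.normal([-0.8, 0.0, 0.6], 0.05, size=(60, 3)),
    rng.normal([-0.4, 0.9, 0.2], 0.05, size=(40, 3)),
])

gmm = GaussianMixture(
    n_components=2, covariance_type="full", n_init=10, random_state=0
)
gmm.fit(X)
labels = gmm.predict(X)  # hard assignment, shape (100,)
probs = gmm.predict_proba(X)  # soft assignment, shape (100, 2)

print(probs.shape)  # (100, 2)
```

scikit-learn returns probabilities with one row per sample, so a transpose would be needed to match an (n_cluster, n_particles) layout.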

Examples

Example of implementation of Gaussian Mixture Model.

>>> import galaxychop as gchop
>>> galaxy = gchop.read_hdf5(...)
>>> galaxy = gchop.star_align(gchop.center(galaxy))
>>> chopper = gchop.GaussianMixture()
>>> chopper.decompose(galaxy)

References

[7] Obreja, A., “Introducing galactic structure finder: the multiple stellar kinematic structures of a simulated Milky Way mass galaxy”, Monthly Notices of the Royal Astronomical Society, vol. 477, no. 4, pp. 4915-4930, 2018. doi:10.1093/mnras/sty1022. https://ui.adsabs.harvard.edu/abs/2018MNRAS.477.4915O/abstract

split(X, y, attributes)[source]

Compute clustering.

Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster.

  • y (Ignored) – Not used, present here for API consistency by convention.

Returns

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • probs (np.ndarray(m_particles) or None) – Probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Otherwise it adopts the value None.

Notes

The attributes used by the model are described in detail in the class documentation.

class galaxychop.models.AutoGaussianMixture(*, cbins=(0.05, 0.005), covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=10, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10, c_bic=0.1, n_jobs=None)[source]

Bases: galaxychop.models._gaussian_mixture.DynamicStarsGaussianDecomposerABC

AutoGaussianMixture class.

Implementation of the auto-gmm method for dynamically decomposing galaxies described by Du et al. (2019) [8].

Parameters
  • c_bic (float, default=0.1) – Cut-off value of the criterion used to choose the number of Gaussians automatically.

  • n_jobs (int, default=None) –

  • **kwargs (key, value mappings) – Other optional keyword arguments are passed through to the GaussianMixture class in the scikit-learn library.

Notes

Index of the cluster each stellar particle belongs to:

Index=0: corresponds to the galaxy stellar halo. Index=1: corresponds to the galaxy bulge. Index=2: corresponds to the galaxy cold disk. Index=3: corresponds to the galaxy warm disk.

Examples

Example of implementation of auto-gmm model.

>>> import galaxychop as gchop
>>> galaxy = gchop.read_hdf5(...)
>>> galaxy = gchop.star_align(gchop.center(galaxy))
>>> chopper = gchop.AutoGaussianMixture()
>>> chopper.decompose(galaxy)

References

[8] Du, M., “Identifying Kinematic Structures in Simulated Galaxies Using Unsupervised Machine Learning”, The Astrophysical Journal, vol. 884, no. 2, 2019. doi:10.3847/1538-4357/ab43cc. https://ui.adsabs.harvard.edu/abs/2019ApJ...884..129D/abstract

split(X, y, attributes)[source]

Compute clustering.

Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster.

  • y (Ignored) – Not used, present here for API consistency by convention.

Returns

  • labels (np.ndarray(m_particles)) – 1D array with the index of the clusters to which each particle belongs. m_particles is the total number of particles with valid values to operate the clustering.

  • probs (np.ndarray(m_particles) or None) – Probabilities of the particles to belong to each component, in case the dynamic decomposition model includes them. Otherwise it adopts the value None.

galaxychop.models.hparam(default, **kwargs)[source]

Create a hyper parameter for decomposers.

By design, every hyper-parameter is required to have a sensible default value.

Parameters
  • default – Sensible default value of the hyper-parameter.

  • **kwargs – Additional keyword arguments are passed and are documented in attr.ib().

Returns

Hyper parameter with a default value.

Notes

This function is a thin wrapper over the attrs function attr.ib().
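A minimal sketch of what a thin wrapper of this kind can look like. The name hparam_sketch and the toy class are hypothetical, and the real function may set additional attr.ib() options (for example metadata or keyword-only behaviour) that are not reproduced here.

```python
import attr

def hparam_sketch(default, **kwargs):
    """Thin wrapper over attr.ib() with a mandatory default (sketch)."""
    return attr.ib(default=default, **kwargs)

@attr.s
class ToyDecomposer:
    eps_cut = hparam_sketch(default=0.6)
    n_bin = hparam_sketch(default=100, converter=int)

toy = ToyDecomposer(n_bin="50")  # the converter turns "50" into 50
print(toy.eps_cut, toy.n_bin)  # 0.6 50
```

Making default a required positional argument is what enforces the rule that every hyper-parameter ships with a usable value.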