glearn.sample_data.generate_points
- glearn.sample_data.generate_points(num_points, dimension=1, grid=False, a=None, b=None, contrast=0.0, seed=0)
Generate a set of points in the unit hypercube.
Points can be generated either randomly or on a lattice.
The density of the points can be either uniform over the whole unit hypercube, or more concentrated in a smaller hypercube region embedded inside the unit hypercube.
- Parameters:
- num_points : int
Determines the number of generated points as follows:
If grid is False, num_points is the number of random points to be generated in the unit hypercube.
If grid is True, num_points is the number of points along each axis of a grid of points. Thus, the total number of points is num_points**dimension.
- dimension : int, default=1
The dimension of the space of points.
- grid : bool, default=False
If True, the points are generated on a lattice grid. Otherwise, the points are generated randomly inside the unit hypercube.
- a : float or array_like, default=None
The coordinates of a corner point of the embedded hypercube inside the unit hypercube. The point a is the closest point of the embedded hypercube to the origin. Its coordinates should lie between the origin and the point (1, 1, ..., 1). If None, a is assumed to be the origin. When dimension is 1, a should be a scalar.
- b : float or array_like, default=None
The coordinates of the opposite corner point of the embedded hypercube inside the unit hypercube. The point b is the furthest point of the embedded hypercube from the origin. Its coordinates should lie between the point a and the point (1, 1, ..., 1). If None, all coordinates of b are assumed to be 1. When dimension is 1, b should be a scalar.
- contrast : float, default=0.0
The extra concentration of points generated inside the embedded hypercube with the corner points a and b. Contrast is the relative difference between the density of the points inside and outside the embedded hypercube, and is between 0 and 1. When set to 0, all points are generated inside the unit hypercube with uniform distribution, so the density inside and outside the inner hypercube is the same. Conversely, when set to 1, all points are generated inside the embedded hypercube and no point is generated outside of it.
- seed : int, default=0
Seed number of the random generator, which can be a non-negative integer. If set to None, the result of the random generator is not repeatable.
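The constraints on a, b, and contrast described above can be summarized in a small check. This is a minimal sketch that does not use glearn; the helper name normalize_corners is hypothetical, and whether the library validates its arguments in this way is an assumption.

```python
from numbers import Real


def normalize_corners(dimension, a=None, b=None, contrast=0.0):
    """Hypothetical helper: apply the documented defaults and range checks."""

    def as_tuple(value, default):
        # None falls back to the documented default corner.
        if value is None:
            return (default,) * dimension
        # A scalar is treated as a one-dimensional point.
        if isinstance(value, Real):
            value = (value,)
        coords = tuple(float(v) for v in value)
        if len(coords) != dimension:
            raise ValueError("corner must have one coordinate per dimension")
        return coords

    a = as_tuple(a, 0.0)  # default: the origin
    b = as_tuple(b, 1.0)  # default: the point (1, 1, ..., 1)

    # On every axis: origin <= a <= b <= 1.
    if not all(0.0 <= lo <= hi <= 1.0 for lo, hi in zip(a, b)):
        raise ValueError("corners must satisfy 0 <= a <= b <= 1 on every axis")
    if not 0.0 <= contrast <= 1.0:
        raise ValueError("contrast must be between 0 and 1")
    return a, b
```

For instance, normalize_corners(1, a=0.2, b=0.5) gives the one-dimensional corners (0.2,) and (0.5,), while a=0.6 with b=0.5 is rejected.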
- Returns:
- x : numpy.ndarray
A 2D array where each row is the coordinate of a point. The size of the array is (n, m), where m is the dimension of the space and n is the number of generated points inside the unit hypercube. The number n is determined as follows:
If grid is True, then n = num_points**dimension.
If grid is False, then n = num_points.
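The relation between num_points, grid, and the number of returned rows can be illustrated without the library. The following is a rough sketch using only the Python standard library; the function sketch_points is hypothetical, it returns a list of tuples rather than a numpy.ndarray, and the placement of grid values (equally spaced, endpoints included) is an assumption rather than documented behavior.

```python
import itertools
import random


def sketch_points(num_points, dimension=1, grid=False, seed=0):
    """Library-independent sketch; returns a list of coordinate tuples."""
    if grid:
        # Assumption: equally spaced values from 0 to 1, endpoints included.
        step = 1.0 / max(num_points - 1, 1)
        axis = [i * step for i in range(num_points)]
        # Cartesian product of the axis with itself: num_points**dimension rows.
        return list(itertools.product(axis, repeat=dimension))

    # Random case: exactly num_points rows. A seed of None makes the
    # generator non-repeatable, mirroring the seed argument described above.
    rng = random.Random(seed)
    return [
        tuple(rng.random() for _ in range(dimension))
        for _ in range(num_points)
    ]
```

With grid=True, 30 points per axis in 2 dimensions give 900 rows of 2 coordinates; with grid=False, 100 points give 100 rows.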
Notes
Grid versus Random Points:
Points are generated in a multi-dimensional space inside the unit hypercube, either randomly (when grid is False) or on a structured grid (when grid is True).
Generate Higher Concentration of Points in a Region:
The points are generated with uniform distribution inside the unit hypercube \([0, 1]^d\). However, it is possible to generate the points with higher concentration in a specific hypercube \(\mathcal{H} \subset [0, 1]^d\), which is embedded inside the unit hypercube. The coordinates of the inner hypercube \(\mathcal{H}\) are determined by its two opposite corner points, given by the arguments a and b.
The argument contrast, here denoted by \(\gamma\), specifies the excess relative density of points in \(\mathcal{H}\) compared to the rest of the unit hypercube. Namely, if \(\rho_{1}\) is the density of points inside \(\mathcal{H}\) and \(\rho_2\) is the density of points outside \(\mathcal{H}\), then
\[\gamma = \frac{\rho_1 - \rho_2}{\rho_2}.\]
If \(\gamma = 0\), there is no difference between the density of the points inside and outside \(\mathcal{H}\), hence all points inside \([0, 1]^d\) are generated with uniform distribution. Conversely, if \(\gamma = 1\), then the density of points outside \(\mathcal{H}\) is zero and no point is generated outside \(\mathcal{H}\); all points are generated inside this region.
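The two densities \(\rho_1\) and \(\rho_2\) used above can be measured on any set of points by counting the points inside and outside \(\mathcal{H}\) and dividing by the corresponding volumes. The sketch below is a plain-Python illustration; the function name densities is hypothetical, and it assumes that \(\mathcal{H}\) has a volume strictly between 0 and 1 so that both divisions are defined.

```python
import math


def densities(points, a, b):
    """Return (rho_1, rho_2): point density inside and outside the box [a, b]."""
    # Count the points whose every coordinate lies between a and b.
    inside = sum(
        all(lo <= x <= hi for x, lo, hi in zip(p, a, b)) for p in points
    )
    # Volume of the inner hypercube; the unit hypercube has volume 1.
    volume_in = math.prod(hi - lo for lo, hi in zip(a, b))
    volume_out = 1.0 - volume_in

    rho_1 = inside / volume_in
    rho_2 = (len(points) - inside) / volume_out
    return rho_1, rho_2


# Ten evenly spread points in [0, 1]: the densities inside and outside
# [0.2, 0.5] agree, which is the situation described for gamma = 0.
uniform = [((i + 0.5) / 10,) for i in range(10)]
rho_1, rho_2 = densities(uniform, (0.2,), (0.5,))
```

When every point lies inside \(\mathcal{H}\), the same function returns \(\rho_2 = 0\), which is the situation described for \(\gamma = 1\).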
Examples
Generate 100 random points in the 1-dimensional interval \([0, 1]\):
>>> from glearn.sample_data import generate_points
>>> x = generate_points(100, dimension=1, grid=False)
>>> x.shape
(100, 1)
Generate 100 random points in the 1-dimensional interval \([0, 1]\) where \(70 \%\) more points are inside \([0.2, 0.5]\) and \(30 \%\) of the points are outside of that interval:
>>> from glearn.sample_data import generate_points
>>> x = generate_points(100, dimension=1, grid=False, a=0.2,
...                     b=0.5, contrast=0.7)
Generate 100 random points on the 2-dimensional square \([0, 1]^2\) where \(70 \%\) more points are inside the rectangle with the corner points \(a=(0.2, 0.3)\) and \(b=(0.4, 0.5)\):
>>> from glearn.sample_data import generate_points
>>> x = generate_points(100, dimension=2, grid=False, a=(0.2, 0.3),
...                     b=(0.4, 0.5), contrast=0.7)
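One way to picture the effect of a, b, and contrast in the examples above is a two-part sample: a fraction of the points is drawn inside the inner region, and the remainder is drawn uniformly over the whole unit hypercube. The sketch below follows that scheme using only the standard library. It is an assumed illustration, not necessarily the algorithm glearn uses, and the name sketch_concentrated is hypothetical.

```python
import random


def sketch_concentrated(num_points, a, b, contrast, seed=0):
    """Assumed scheme: a `contrast` fraction of the points is drawn inside
    the box [a, b]; the rest are uniform over the unit hypercube."""
    rng = random.Random(seed)
    num_inner = round(contrast * num_points)

    points = []
    for i in range(num_points):
        if i < num_inner:
            # Extra points, restricted to the inner box on every axis.
            p = tuple(rng.uniform(lo, hi) for lo, hi in zip(a, b))
        else:
            # Remaining points, uniform over the whole unit hypercube.
            p = tuple(rng.random() for _ in a)
        points.append(p)
    return points


# 100 points in the unit square, concentrated in the rectangle
# with corners (0.2, 0.3) and (0.4, 0.5).
pts = sketch_concentrated(100, (0.2, 0.3), (0.4, 0.5), contrast=0.7)
```

Under this assumed scheme, contrast=0 reduces to a plain uniform sample and contrast=1 places every point inside the inner region, matching the two limiting cases described in the Notes.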
Generate a two-dimensional grid of \(30 \times 30\) points in the square \([0, 1]^2\):
>>> from glearn.sample_data import generate_points
>>> x = generate_points(30, dimension=2, grid=True)
>>> x.shape
(900, 2)