utils¶
Helper functions
GzipInputStream | Simple class that allows streaming reads from GZip files (from https://gist.github.com/beaufour/4205533).
bytes2human | Return string representation of bytes.
clean_object_name | Remove leading "/" from object_name
generate_ndarray_chunks | A generator that splits an array into chunks of desired byte size
get_fileobject_size | Return byte size of file-object
get_key_from_environ | Get AWS keys from environment variables if available
get_key_from_s3fs | Get AWS keys from default S3fs location if available.
get_keys | Read AWS keys from S3fs configuration or environment variables.
get_object_size | Return the size of the S3 object in MB
has_magic | Check string to see if it has any glob magic
has_real_magic | Check if string has non-trivial glob pattern
has_start_digit |
has_trivial_magic | Check string to see if it has trivial glob magic (e.g. "path/*").
mk_aws_path | Make the path behave as expected when querying S3 with list_objects.
objects2names | Return the name of all objects in a list
pathjoin | Join two or more pathname components, inserting SEPARATOR as needed.
print_objects | Print name, size, and creation date of objects in list.
read_buffered | Fill a numpy n-d array with file-like object contents
remove_root | Remove leading "/" from a string
remove_trivial_magic |
sanitize_metadata |
split_uri | Convert a URI to a bucket, object name tuple.
string2bool |
unquote_names | Clean URL names from a list.
GzipInputStream¶
- class cottoncandy.utils.GzipInputStream(fileobj, block_size=16384)¶
Bases: object
Simple class that allows streaming reads from GZip files (from https://gist.github.com/beaufour/4205533).
Python 2.x gzip.GzipFile relies on .seek() and .tell(), so it does not support streaming reads (@see: http://bo4.me/YKWSsL).
Adapted from: http://effbot.org/librarybook/zlib-example-4.py
- __init__(fileobj, block_size=16384)¶
Initialize with the given file-like object.
@param fileobj: file-like object,
- next()¶
- read(size=0)¶
- readline()¶
- readlines()¶
- seek(offset, whence=0)¶
- tell()¶
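The streaming idea described above can be sketched with the standard library alone. This is a minimal illustration, not the class's actual code: `iter_gunzip` is a hypothetical helper that reads a gzip stream block by block and never calls `.seek()` or `.tell()` on the source.

```python
import gzip
import io
import zlib


def iter_gunzip(fileobj, block_size=16384):
    """Yield decompressed byte chunks from a gzip stream, reading forward only.

    Hypothetical helper for illustration; not part of cottoncandy.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        block = fileobj.read(block_size)
        if not block:
            break
        data = decompressor.decompress(block)
        if data:
            yield data
    # Emit whatever is still buffered inside the decompressor.
    tail = decompressor.flush()
    if tail:
        yield tail


payload = b"hello streaming gzip\n" * 1000
compressed = io.BytesIO(gzip.compress(payload))
restored = b"".join(iter_gunzip(compressed, block_size=256))
```

Only `read()` is required of the source object, which is why this approach suits network responses and other non-seekable streams.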
bytes2human¶
- cottoncandy.utils.bytes2human(nbytes)¶
Return string representation of bytes.
- Parameters:
nbytes (int) – Number of bytes
- Returns:
human_bytes – Human readable byte size (e.g. “10.00MB”, “1.24GB”, etc.).
- Return type:
str
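A conversion of this kind might look like the sketch below. The name `sketch_bytes2human`, the 1024-based units, and the two-decimal format are assumptions chosen to match the example strings above; the library's own rounding and unit choices may differ.

```python
def sketch_bytes2human(nbytes):
    """Format a byte count as a short human-readable string.

    Illustrative only: assumes 1024-based units and two decimals.
    """
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(nbytes)
    for unit in units[:-1]:
        if size < 1024.0:
            return "%.2f%s" % (size, unit)
        size /= 1024.0
    # Anything larger than the biggest unit is still reported in that unit.
    return "%.2f%s" % (size, units[-1])


ten_megabytes = sketch_bytes2human(10 * 1024 ** 2)
```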
clean_object_name¶
- cottoncandy.utils.clean_object_name(input_function)¶
Remove leading “/” from object_name
This is important for compatibility with S3fs. S3fs does not list objects with a “/” prefix.
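Because the argument is named input_function, this helper reads like a decorator. The sketch below shows one way such a decorator could work; `sketch_clean_object_name` and `DummyInterface` are hypothetical, and the sketch assumes the wrapped method takes object_name as its first argument after self and that every leading "/" should be stripped.

```python
import functools


def sketch_clean_object_name(input_function):
    """Decorator sketch: strip leading "/" from object_name before the call.

    Assumption: object_name is the first argument after self.
    """
    @functools.wraps(input_function)
    def wrapper(self, object_name, *args, **kwargs):
        # lstrip removes every leading slash, not just the first one.
        return input_function(self, object_name.lstrip("/"), *args, **kwargs)
    return wrapper


class DummyInterface:
    """Stand-in class used only to exercise the decorator."""

    @sketch_clean_object_name
    def describe(self, object_name):
        return object_name


cleaned = DummyInterface().describe("/path/to/object")
```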
generate_ndarray_chunks¶
- cottoncandy.utils.generate_ndarray_chunks(arr, axis=None, buffersize=104857600)¶
A generator that splits an array into chunks of desired byte size
- Parameters:
arr (np.ndarray)
axis (int, None) – The axis along which to slice the array. If None is given, the array is chunked into ideal isotropic voxels.
buffersize (scalar) – Byte size of the desired array chunks
- Returns:
iterator – The object yields the tuple: (chunk_coordinates, chunk_data_slice)
chunk_coordinates: Indices of the current chunk along each dimension
chunk_data_slice: Data for this chunk
- Return type:
generator object
Notes
axis=None is WIP and only works well for near-isotropic matrices.
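Chunking along a single axis can be sketched as follows. This is not the library's implementation: `sketch_axis_chunks` is a hypothetical name, the axis=None case is not covered, and the coordinate tuple is an assumed layout (zero everywhere except the chunk index along the sliced axis).

```python
import numpy as np


def sketch_axis_chunks(arr, axis=0, buffersize=100 * 2 ** 20):
    """Yield (chunk_coordinates, chunk_data_slice) along one axis.

    Illustrative only; each chunk holds at most `buffersize` bytes unless
    a single slice along `axis` is already larger than that.
    """
    if arr.size == 0:
        return
    # Bytes taken by one index along the chosen axis.
    bytes_per_slice = arr.nbytes // arr.shape[axis]
    step = max(1, buffersize // bytes_per_slice)
    for chunk_idx, start in enumerate(range(0, arr.shape[axis], step)):
        index = [slice(None)] * arr.ndim
        index[axis] = slice(start, start + step)
        coords = [0] * arr.ndim
        coords[axis] = chunk_idx
        yield tuple(coords), arr[tuple(index)]


data = np.arange(24, dtype=np.float64).reshape(6, 4)  # 32 bytes per row
chunks = list(sketch_axis_chunks(data, axis=0, buffersize=64))
```

With 32-byte rows and a 64-byte buffer, each chunk covers two rows, so the array is split into three pieces that concatenate back to the original.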
get_fileobject_size¶
- cottoncandy.utils.get_fileobject_size(file_object)¶
Return byte size of file-object
- Parameters:
file_object (file object)
- Returns:
nbytes
- Return type:
int
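For a seekable file object, the byte size can be measured by seeking to the end, as in this sketch. `sketch_fileobject_size` is a hypothetical name, and restoring the original position afterwards is a design choice of the sketch, not a documented property of the library function.

```python
import io
import os


def sketch_fileobject_size(file_object):
    """Return the byte size of a seekable file object (illustrative)."""
    position = file_object.tell()
    file_object.seek(0, os.SEEK_END)
    nbytes = file_object.tell()
    # Put the read position back where the caller left it.
    file_object.seek(position)
    return nbytes


buffer = io.BytesIO(b"abcdef")
buffer.read(2)
size = sketch_fileobject_size(buffer)
```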
get_key_from_environ¶
- cottoncandy.utils.get_key_from_environ()¶
Get AWS keys from environment variables if available
- Returns:
ACCESS_KEY (str)
SECRET_KEY (str)
Notes
Reads the AWS_ACCESS_KEY and AWS_SECRET_KEY environment variables
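A lookup of this kind can be sketched with os.environ. The helper name `sketch_keys_from_environ`, the optional environ argument, and the choice to return None for a missing variable are assumptions made for illustration.

```python
import os


def sketch_keys_from_environ(environ=None):
    """Return (ACCESS_KEY, SECRET_KEY) from the environment (illustrative).

    A missing variable yields None in this sketch.
    """
    environ = os.environ if environ is None else environ
    return environ.get("AWS_ACCESS_KEY"), environ.get("AWS_SECRET_KEY")


# A plain dict stands in for the process environment here.
keys = sketch_keys_from_environ(
    {"AWS_ACCESS_KEY": "example-access", "AWS_SECRET_KEY": "example-secret"}
)
```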
get_key_from_s3fs¶
- cottoncandy.utils.get_key_from_s3fs()¶
Get AWS keys from default S3fs location if available.
- Returns:
ACCESS_KEY (str)
SECRET_KEY (str)
Notes
Reads ~/.passwd-s3fs to get the access key and secret key
get_keys¶
- cottoncandy.utils.get_keys()¶
Read AWS keys from S3fs configuration or environment variables.
- Returns:
ACCESS_KEY (str)
SECRET_KEY (str)
get_object_size¶
- cottoncandy.utils.get_object_size(boto_s3_object)¶
Return the size of the S3 object in MB
- Parameters:
boto_s3_object (boto object)
- Returns:
object_size
- Return type:
float (in MB)
has_magic¶
- cottoncandy.utils.has_magic(s)¶
Check string to see if it has any glob magic
has_real_magic¶
- cottoncandy.utils.has_real_magic(s)¶
Check if string has non-trivial glob pattern
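The distinction between any, trivial, and non-trivial glob magic could be sketched as below. All three helpers are hypothetical, and the definitions are assumptions: "magic" means the string contains `*`, `?`, or `[`, "trivial" means the only magic is a trailing "/*", and "real" means magic that is not trivial.

```python
import re

# Characters that give a string glob meaning (assumed set).
MAGIC_CHARS = re.compile(r"[*?\[]")


def sketch_has_magic(s):
    """True if the string contains any glob character."""
    return MAGIC_CHARS.search(s) is not None


def sketch_has_trivial_magic(s):
    """True if the only glob magic is a trailing "/*"."""
    return s.endswith("/*") and not sketch_has_magic(s[:-1])


def sketch_has_real_magic(s):
    """True if the string has glob magic beyond a trailing "/*"."""
    return sketch_has_magic(s) and not sketch_has_trivial_magic(s)


results = {
    pattern: (
        sketch_has_magic(pattern),
        sketch_has_trivial_magic(pattern),
        sketch_has_real_magic(pattern),
    )
    for pattern in ["path/file.npy", "path/*", "path/sub*/x?.npy"]
}
```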
has_start_digit¶
- cottoncandy.utils.has_start_digit(s)¶
has_trivial_magic¶
Check string to see if it has trivial glob magic (e.g. “path/*”).
mk_aws_path¶
- cottoncandy.utils.mk_aws_path(path)¶
Make the path behave as expected when querying S3 with list_objects.
xxx/yyy -> xxx/yyy/
xxx/ -> xxx/
xxx -> xxx/
/ -> ‘’
‘’ -> ‘’
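The mapping listed above can be reproduced with a few lines of Python. `sketch_mk_aws_path` is a hypothetical stand-in that covers only the five listed cases; how the library treats other inputs (for example a leading "/") is not stated here and is not modelled.

```python
def sketch_mk_aws_path(path):
    """Normalize a path into a list_objects-style prefix (illustrative)."""
    if path in ("", "/"):
        # The bucket root is expressed as an empty prefix.
        return ""
    return path if path.endswith("/") else path + "/"


examples = {p: sketch_mk_aws_path(p) for p in ["xxx/yyy", "xxx/", "xxx", "/", ""]}
```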
objects2names¶
- cottoncandy.utils.objects2names(objects)¶
Return the name of all objects in a list
- Parameters:
objects (list (of boto3 objects))
- Returns:
object_names
- Return type:
list (of strings)
pathjoin¶
- cottoncandy.utils.pathjoin(a, *p)¶
Join two or more pathname components, inserting SEPARATOR as needed. If any component is an absolute path, all previous path components will be discarded. An empty last part will result in a path that ends with a separator.
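The behaviour described matches that of the standard library's posixpath.join, so the three rules can be illustrated with it. This assumes SEPARATOR is "/"; the snippet uses posixpath directly and does not call pathjoin itself.

```python
import posixpath

# Components are joined with "/" as needed.
joined = posixpath.join("bucket_dir", "sub", "file.npy")

# An absolute component discards everything before it.
absolute_wins = posixpath.join("bucket_dir", "/other", "file.npy")

# An empty last part leaves a trailing separator.
trailing = posixpath.join("bucket_dir", "")
```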
print_objects¶
- cottoncandy.utils.print_objects(object_list)¶
Print name, size, and creation date of objects in list.
- Parameters:
object_list (list (of boto3 objects))
read_buffered¶
- cottoncandy.utils.read_buffered(frm, to, buffersize=64)¶
Fill a numpy n-d array with file-like object contents
- Parameters:
frm (buffer) – Object with a read method
to (np.ndarray) – Array to which the contents will be put
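Filling an array from a file-like object in fixed-size reads can be sketched as follows. `sketch_read_buffered` is a hypothetical helper; it assumes a C-contiguous destination and treats buffersize as a byte count, which may not be the unit the library uses.

```python
import io

import numpy as np


def sketch_read_buffered(frm, to, buffersize=64):
    """Copy bytes from `frm` into the array `to`, `buffersize` bytes at a time.

    Illustrative only. Returns the number of bytes copied.
    """
    if not to.flags.c_contiguous:
        raise ValueError("destination array must be C-contiguous")
    # A flat uint8 view shares memory with `to`, so writes land in the array.
    flat = to.reshape(-1).view(np.uint8)
    offset = 0
    while offset < flat.size:
        block = frm.read(min(buffersize, flat.size - offset))
        if not block:
            break  # source ran out before the array was full
        flat[offset:offset + len(block)] = np.frombuffer(block, dtype=np.uint8)
        offset += len(block)
    return offset


source = np.arange(12, dtype=np.float64).reshape(3, 4)
destination = np.empty_like(source)
copied = sketch_read_buffered(io.BytesIO(source.tobytes()), destination, buffersize=10)
```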
remove_root¶
- cottoncandy.utils.remove_root(string_)¶
Remove leading “/” from a string
remove_trivial_magic¶
- cottoncandy.utils.remove_trivial_magic(s)¶
xxx/* -> xxx/
xxx/ -> xxx/
xxx//yyy/ -> xxx//yyy/
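The three mappings above are consistent with simply dropping a trailing "*" that follows a "/". A minimal sketch under that assumption, using the hypothetical name `sketch_remove_trivial_magic`:

```python
def sketch_remove_trivial_magic(s):
    """Drop a trailing "*" that directly follows "/" (illustrative)."""
    return s[:-1] if s.endswith("/*") else s


mapped = {p: sketch_remove_trivial_magic(p) for p in ["xxx/*", "xxx/", "xxx//yyy/"]}
```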
sanitize_metadata¶
- cottoncandy.utils.sanitize_metadata(metadict)¶
split_uri¶
- cottoncandy.utils.split_uri(uri, pattern='s3://', separator='/')¶
Convert a URI to a bucket, object name tuple.
‘s3://bucket/path/to/thing’ -> (‘bucket’, ‘path/to/thing’)
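The conversion shown above can be sketched with plain string operations. `sketch_split_uri` is a hypothetical stand-in; leaving a URI without the pattern untouched and returning an empty object name for a bare bucket are assumptions of the sketch.

```python
def sketch_split_uri(uri, pattern="s3://", separator="/"):
    """Split a URI into (bucket, object_name) (illustrative)."""
    rest = uri[len(pattern):] if uri.startswith(pattern) else uri
    # Everything before the first separator is the bucket.
    bucket, _, object_name = rest.partition(separator)
    return bucket, object_name


parts = sketch_split_uri("s3://bucket/path/to/thing")
```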
string2bool¶
- cottoncandy.utils.string2bool(mstring)¶
unquote_names¶
- cottoncandy.utils.unquote_names(object_names)¶
Clean URL names from a list.
- Parameters:
object_names (list (of strings))
- Returns:
clean_object_names
- Return type:
list (of strings)
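Cleaning URL-quoted names can be sketched with urllib.parse.unquote from the standard library. `sketch_unquote_names` is a hypothetical name, and whether the library decodes "+" as a space or applies other cleaning is not stated here.

```python
from urllib.parse import unquote


def sketch_unquote_names(object_names):
    """Decode percent-escapes in each name (illustrative)."""
    return [unquote(name) for name in object_names]


clean = sketch_unquote_names(["my%20file.npy", "dir%2Fnested.npy", "plain.npy"])
```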