Commit e867a73e authored by Luis Fernandez Ruiz's avatar Luis Fernandez Ruiz

python

- add img_process_dir
Readme
- update instructions
parent 3b11bb07
This technique for characterizing materials consists of beaming neutrons against a sample. The collision between the
neutrons and the sample's atoms leads the former to scatter. This *scattering* is caught by a detector placed at the appropriate
distance. So we can say that in this technique we have three main variables:
* **instrument's parameters:** for each experiment, they are **known**, e.g. distance, collimation, wavelength and
background.
* **sample's parameters:** they are what scientists want to obtain by analyzing the scattered image. They are **unknown**,
e.g. radius for the *Sphere* model; radius, shell, rho-shell... for the *Core-Shell Sphere* one.
* **scattered image:** the *neutron scattering* pattern produced by the sample and instrument parameter configuration.
As we will see later, there are different types of scattered images. Each of them is more appropriate for studying
a particular feature of the sample. This is why we classify images into several **types (clusters)**.
**But... why is it necessary to apply deep learning and regression models here?**
allow the study of particle sizes (high wavelength), and the latter of atom sizes (
**IMPORTANT: From now on, and in all the comments in the scripts, we can refer to different clusters by the number shown in the
figure. e.g. cluster 0='bad Guinier', cluster 1='Guinier',...**
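For reference, the numbering can be written as a small dictionary (labels taken from the cluster list used in [main_SANS.py](python/main_SANS.py); the exact labels in the figure may differ slightly):

```python
# Cluster numbering used throughout the scripts (labels from main_SANS.py)
CLUSTERS = {
    0: "bad Guinier",
    1: "Guinier",
    2: "one ring",
    3: "two or three rings",
    4: "four or five rings",
    5: "more than five rings",
    6: "bad background",
    7: "background",
}
```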
## Workflow proposed
Having described the problem and the types of scattered images scientists want to obtain, we now describe
the workflow we propose to implement in the real instruments. See the image below:
![Figure 3](readme_figures/Workflow.png)
1. initial choice of instrumental parameters based on the scientist's intuition about the sample.
2. set them up in the real instrument. It will produce a *real scattered image*, based on the fixed instrument
parameters and the unknown sample ones.
3. if it happens to be the type of image scientists want (not frequent), they have finished. If not, the image enters
as input to our **trained convolutional neural network**.
4. the **CNN classifies the real image** into one of the clusters described above. With this information and the
chosen instrument parameters, our **regression model** will **infer the values of the sample's parameters**.
5. once we have predicted the sample's parameters, it is straightforward to **choose an optimal set of instrumental
parameters** that gives scientists the image category they want. We set them in the real instrument.
6. the new real scattered image's category will be the one scientists want. And all this after only **one
iteration**. Besides, we can offer scientists an interval for their sample's parameters, something previously done by
analyzing a real scattered image manually.
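The steps above can be sketched in a few lines of Python; `infer_sample_params` and `suggest_instrument_params` are hypothetical stand-ins for the regression model and the parameter search, not functions from this repository:

```python
def one_shot_workflow(instrument_params, image_cluster, desired_cluster,
                      infer_sample_params, suggest_instrument_params):
    """One pass of the proposed loop; the two callables are hypothetical
    stand-ins for the regression model and the parameter search."""
    if image_cluster == desired_cluster:      # step 3: already the desired type
        return instrument_params
    # step 4: infer the (unknown) sample parameters from cluster + instrument params
    sample_params = infer_sample_params(image_cluster, instrument_params)
    # step 5: choose instrument parameters that should yield the desired cluster
    return suggest_instrument_params(sample_params, desired_cluster)
```

The point of the sketch is that a single iteration already returns a new parameter set, matching step 6 above.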
## Repository structure
The repository is organized in 4 folders:
### python
All the scripts in this folder should be in the same directory when the time comes to use them. The most important
scripts (because they execute all the others) are [retrain.py](python/retrain.py) (to train the CNN),
[GUI_SANS.py](python/GUI_SANS.py) (Graphical User Interface) and [main_SANS.py](python/main_SANS.py)
(testing our complete algorithm (classification and regression) in order to improve it).
Descriptions of the files are listed below:
* **GUI_SANS.py:** Graphical User Interface (not really fancy, by the way...). It asks the user to input the following information by keyboard:
  * particle model: whether the user is going to input a *"Sphere"* or a *"Core-Shell Sphere"* scattered image.
  * whether the user is going to input a real or a simulated image.
  * the path of the scattered image the user wants to transform into the desired type. If it is a real image, it also requires the
parameters and the predicted sample's parameters in the same folder as the original image.
* **img_process.py:** given a single real scattered image, this script applies functions to make it more similar
to simulated ones.
* **img_process_dir.py:** similar to [img_process.py](python/img_process.py) but applied to a whole folder of images.
* **label_classify_image_folder.py:** adaptation of [label_image.py](python/label_image.py). It serves the same purpose
but for all the images in a folder (the user specifies the path). To each image, it applies the [label_image.py](python/label_image.py)
script to classify it into a cluster. After that, it moves the classified image into a folder corresponding to the cluster
it has been assigned to. At the end, we get a "subfolder tree" (one subfolder per cluster) inside the
folder passed as input argument. Besides, it generates a log file that we use
in several scripts ([results.csv](doc_files/Sphere/results.csv)).
* **main_SANS.py:** it executes several scripts sequentially to:
1. generate simulated scattered images. [save_sim_sphere_coresphere.m](matlab/save_sim_sphere_coresphere.m)
2. classify them into several clusters or types. [label_classify_image_folder.py](python/label_classify_image_folder.py)
3. plot distribution of images based on radius, wavelength, distance and classification. [plot_results.py](python/plot_results.py)
4. suggest new instrumental parameters to transform initial images (created in i.) into the desired type of image
(specified by the user) [multiv_multip_regression.py](python/multiv_multip_regression.py)
5. generate suggested images. [create_suggested_images.m](matlab/create_suggested_images.m)
6. classify them. [label_classify_image_folder.py](python/label_classify_image_folder.py).
* **misclassified_images.py:** to train a Convolutional Neural Network to classify images, we first have to create a
"subfolder tree" structure in a directory. In each subfolder we manually put all the images we think
belong to the same type (cluster). When this is done, we apply [retrain.py](python/retrain.py) to the directory that
contains the subfolder tree. In this way, the Convolutional Neural Network learns from our classification, but in this
we want to receive the name of these misclassified images (setting *--print_misc
going to give us in the console the paths of all these images. If we copy this log to a *.txt* file and feed it to this
script, it shows us the "misclassified images" and gives us the option of relocating them (by
inputting 'y' for yes or 'n' for no) to the subfolder the CNN has decided.
* **move_files.py:** several functions to move files and extract information from file names and their classification.
* **multiv_multip_regression.py:** reading a [results.csv](doc_files/Sphere/results.csv) file, it suggests a transformation
for all the images in a folder (the user specifies the path) to convert them into the desired type. To do so, it applies
a regression with:
* **independent variables:** instrument's parameters and the CNN classification of the image
that we know produce the type of image we want for the predicted sample's parameters. As output, this script
creates a [suggest.csv](doc_files/Sphere/suggest.csv) file with the new instrument's parameters that will
transform each image into the desired type.
* **plot_results.py:** given a [results.csv](doc_files/Sphere/results.csv) file, it plots the distribution of images in it
based on radius, wavelength, distance and cluster classification. It can only be applied to the **'Sphere'** particle model.
* **regression_radius_deep.py:** similar to [multiv_multip_regression.py](python/multiv_multip_regression.py). The main
difference is that it uses deep learning as the regression method, and it can only infer one sample parameter, so it can
only be applied to *'Sphere'* particles (we only want to infer the *radius* in this kind of model).
* **GUI_imgs:** folder that contains the images shown in [GUI_SANS.py](python/GUI_SANS.py) to help the user make
a decision.
**Tensorflow scripts:**
* **retrain.py:** transfer learning module to train and save a Convolutional Neural Network model. Tensorflow script. More info in
[retrain.py](https://github.com/tensorflow/hub/blob/master/examples/image_retraining/retrain.py).
* **label_image.py:** transfer learning script. It uses a model produced in [retrain.py](python/retrain.py) to classify
individual images. More info in
[label_image.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/label_image.py).
### matlab
All the scripts in this folder should be in the same directory in order to be usable.
Among all the scripts, we describe the main ones below. They call the other scripts in the folder.
* **save_sim_sphere_coresphere.m:** script developed by ILL scientists to generate simulated scattered images based on
instrumental (dist, col, wav,...) and sample parameters (radius, shell,...). The following scripts are based on this one.
* **create_suggested_images.m:** creates the images whose description is specified in [suggest.csv](doc_files/Sphere/suggest.csv).
* **create_img_param.m:** script used in [multiv_multip_regression.py](python/multiv_multip_regression.py) for generating
individual scattered images. To do so, we have to give it as input an array of parameter values with the following
structure: *['dist', dist_value, 'col', col_value, 'wav', wav_value, ...]*
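That name/value array can be built from Python before handing it to the MATLAB engine. A minimal sketch (the parameter values are made up for illustration; the engine call is commented out because it needs a running MATLAB session):

```python
def build_param_array(**params):
    """Flatten keyword arguments into the ['name', value, ...] layout
    expected by create_img_param.m (Python 3.7+ keeps keyword order)."""
    arr = []
    for name, value in params.items():
        arr.extend([name, float(value)])
    return arr

args = build_param_array(dist=2.0, col=5.0, wav=6.0)
# args == ['dist', 2.0, 'col', 5.0, 'wav', 6.0]
# eng.create_img_param(args, nargout=0)  # hypothetical call through matlab.engine
```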
### doc_files
It contains template files that are generated by the python scripts.
* **results.csv:** file (csv format) produced in [label_classify_image_folder.py](python/label_classify_image_folder.py)
after classifying all the images in a folder into the subfolders corresponding to each image's cluster. It contains
info on the classified images:
  * instrument's parameters: distance, collimation, wavelength, background
  * CNN classification for the image, and the mean of classification, e.g. 80% type 1 (Guinier) and 20% type 2 (One ring)
gives a "CNN classification"=**1** (Guinier) and a "mean of classification"=0.8\*1+0.2\*2= **1.2**
  * sample's parameters: radius for the *Sphere* model; radius, shell, rho-shell... for the *Core-Shell Sphere* one.
It is used in [GUI_SANS.py](python/GUI_SANS.py), [multiv_multip_regression.py](python/multiv_multip_regression.py)
and [regression_radius_deep.py](python/regression_radius_deep.py).
* **suggest.csv:** file (csv format). It is created in [multiv_multip_regression.py](python/multiv_multip_regression.py) and
[regression_radius_deep.py](python/regression_radius_deep.py) and it is used in [create_suggested_images.m](matlab/create_suggested_images.m).
The structure of the file is:
* instrument's parameters: distance, collimation, wavelength, background
* CNN classification for this image. e.g. 80% type 1 (Guinier) and 20% type 2 (One ring) gives a
"CNN classification"=**1**(Guinier)
* sample's parameters: radius for *Sphere* model, radius, shell, rho-shell... for *Core-Shell Sphere* one.
* error for sample's parameters: distance between real and predicted sample's parameter value.
* **Original_misclassified.txt:** file (.txt format) produced by copying and pasting the logs produced by [retrain.py](python/retrain.py)
when the option *--print_misclassified_images* is set to *True*. It is used in
[misclassified_images.py](python/misclassified_images.py) to reclassify images.
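The "CNN classification" and "mean of classification" columns described above can be reproduced in a few lines; the probability dict below is the 80%/20% example from the text, and `summarize_cnn_output` is a name chosen here for illustration:

```python
def summarize_cnn_output(probs):
    """probs maps cluster number -> CNN probability for one image.
    Returns the top cluster and the probability-weighted mean of cluster numbers."""
    top_cluster = max(probs, key=probs.get)
    mean_cluster = sum(cluster * p for cluster, p in probs.items())
    return top_cluster, mean_cluster

top, mean = summarize_cnn_output({1: 0.8, 2: 0.2})
# top == 1 (Guinier), mean == 0.8*1 + 0.2*2 == 1.2
```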
resulting scattered image. This is how simulated images are usually going to
the reader, in these examples, at the beginning of the name, between "[]", we have added the name and number of the cluster.
### models
It contains two folders (one for each particle model). In each one of them we can find
the following files:
* **[output_graph.pb](models/Sphere/output_graph.pb) and [output_labels.txt](models/Sphere/output_labels.txt):** they
contain all the info related to the classification model. They are produced in [retrain.py](python/retrain.py).
We reference them each time we want to classify images.
* **rf:** folder that contains the regression models: cluster_0.pkl, cluster_1.pkl, cluster_2.pkl... The number in the
name specifies the cluster for which that regression model applies. e.g. cluster_0.pkl applies to *"bad guinier's"*
scattered images (cluster 0).
## Workflow's summary
After presenting the structure of the repository, we explain how to use all the files. For this purpose, we
follow the steps we took in the project.
'''
/*********************************************************************************************
Real images are very different from simulated ones: dimmer colours and much more noise.
This script applies transformations to images. The objective is to make simulated and real images more similar.
In this way, the CNN can classify real images better.
------------------------------------------------------
Author: FERNANDEZ RUIZ Luis
Last modified: 20/05/2019
File: img_process_dir.py
*********************************************************************************************
'''
import argparse
import numpy as np
import os
from os.path import join
import cv2
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from skimage.color import rgb2gray
from skimage.feature import canny
import datetime
from time import strftime
'''
FUNCTION: plot_images
Objective: plot 6 images (2x3): original, original with clusters, original with edges in one row, and the
analogous denoised versions in the next row
Arguments:
    input:
        category (str): category of the image. It is displayed in the plot.
        orig (numpy.ndarray) (3D image) (0 to 255): original image. We are going to transform it.
        orig_clust (numpy.ndarray) (3D image) (0 to 255): transformation applied to 'orig'. We apply kmeans to group pixels into k clusters (we specify k).
        orig_edge (numpy.ndarray) (2D image) (bool): transformation applied to 'orig'. We only keep the edges of the image.
        denoise (numpy.ndarray) (3D image) (0 to 255): 'orig' image but denoised.
        denoise_clust (numpy.ndarray) (3D image) (0 to 255): analogous to 'orig_clust' but denoised.
        denoise_edge (numpy.ndarray) (2D image) (bool): analogous to 'orig_edge' but denoised.
    output:
        void
'''
def plot_images(category, orig, orig_clust, orig_edge, denoise, denoise_clust, denoise_edge):
    fig, ax = plt.subplots(nrows=2, ncols=3)
    ax[0, 0].imshow(orig)
    ax[0, 0].set_title("Original. Image category: %s" % (str(category)))
    ax[0, 1].imshow(orig_clust)
    ax[0, 1].set_title("Orig cluster")
    ax[0, 2].imshow(orig_edge)
    ax[0, 2].set_title("Orig edge")
    ax[1, 0].imshow(denoise)
    ax[1, 0].set_title("Denoise")
    ax[1, 1].imshow(denoise_clust)
    ax[1, 1].set_title("Denoise cluster")
    ax[1, 2].imshow(denoise_edge)
    ax[1, 2].set_title("Denoise edge")
    plt.show()
'''
FUNCTION: obtain_clust_edge_pic
Objective: apply kmeans to an image. It classifies each pixel by its colour into one of the k clusters we specify.
Arguments:
    input:
        img (numpy.ndarray) (3D image) (0 to 255): image we want to transform
        clusters (int): number of clusters we want to make
        sigma (float): used to define a lower and upper threshold. Edge pixels above the upper limit
            are kept in the edge map and edge pixels below the lower threshold are discarded
    output:
        cluster_pic (numpy.ndarray) (3D image) (0 to 255): transformation applied to 'img'. We apply kmeans to group pixels into k clusters (we specify k).
        edges (numpy.ndarray) (2D image) (bool): transformation applied to 'img'. We only keep the edges of the image
'''
def obtain_clust_edge_pic(img, clusters, sigma):
    # PART 1: OBTAINING CLUSTER IMAGE
    # At the beginning each pixel has a value between 0 and 255
    img_reduc = img / 255
    # Transform image from 3D (x,y,z) to 2D (x*y,z). z is the number of channels (RGB)
    pic_n = img_reduc.reshape(img_reduc.shape[0] * img_reduc.shape[1], img_reduc.shape[2])
    # Get clusters of the image
    kmeans = KMeans(n_clusters=clusters, random_state=0).fit(pic_n)
    pic2show = kmeans.cluster_centers_[kmeans.labels_]
    # Undo the transformation, i.e. pass from 2D (x*y,z) to 3D (x,y,z)
    cluster_pic = pic2show.reshape(img_reduc.shape[0], img_reduc.shape[1], img_reduc.shape[2])
    # PART 2: OBTAINING EDGE IMAGE
    # first we transform the image into a 2D one
    gray = rgb2gray(cluster_pic)
    # then we define a threshold centered on the median of the pixel values. Edge pixels above the
    # upper limit are kept in the edge map and edge pixels below the lower threshold are discarded.
    # NOTE: gray is scaled to [0, 1], so the thresholds must stay floats
    # (casting them to int would collapse both to 0)
    v = np.median(gray)
    lower = max(0.0, (1.0 - sigma) * v)
    upper = min(1.0, (1.0 + sigma) * v)
    # Obtain the edge image (skimage's canny takes the thresholds as keyword arguments)
    edges = canny(gray, low_threshold=lower, high_threshold=upper)
    return cluster_pic, edges
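# The threshold rule used above is the common median-based "auto-Canny" heuristic.
# A minimal, self-contained illustration of just that rule (the helper name and
# the constant test image are made up for this sketch):
def _auto_thresholds_demo(gray, sigma=0.33):
    """Return (lower, upper) Canny thresholds around the median of a [0, 1] image."""
    v = float(np.median(gray))
    lower = max(0.0, (1.0 - sigma) * v)
    upper = min(1.0, (1.0 + sigma) * v)
    return lower, upper
# e.g. an image whose pixels are all 0.5 gives lower = 0.335, upper = 0.665 for sigma = 0.33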
'''
FUNCTION: mark_edges
Objective: highlight the edges of an image. We do this by overlapping an edge image on the original image.
Arguments:
    input:
        img (numpy.ndarray) (3D image) (0 to 255): image whose edges we want to mark
        edges (numpy.ndarray) (2D image) (bool): image that only has the edges of 'img'
    output:
        img (numpy.ndarray) (3D image) (0 to 255): image with the edges marked
'''
def mark_edges(img, edges):
    width, height, channels = img.shape
    # iterate over each pixel
    for x in range(width):
        for y in range(height):
            if edges[x, y]:  # if there is a border at this pixel, i.e. edge == True
                img[x, y, :] = 0  # we mark this part of the original image in black
    return img
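# The pixel loop in mark_edges can also be written with NumPy boolean masking;
# this is a sketch of an equivalent (and much faster) version, not the author's code:
def mark_edges_fast(img, edges):
    """Black out edge pixels using a boolean mask instead of a Python loop."""
    out = img.copy()
    out[edges] = 0  # the 2D mask broadcasts over the colour channels
    return out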
'''
FUNCTION: img_preprocess
Objective: 1) plot the 2x3 matrix of images described in the header.
           2) save the transformed images (with marked edges) in a directory
Arguments:
    input:
        path_imgs (string): path to the folder containing the images we want to transform
        path_save (string): folder where we want to save the transformed images
        clusters (int): number of clusters we want to make. READ 'obtain_clust_edge_pic'
        sigma (float): used to define a lower and upper threshold. Edge pixels above the upper limit
            are kept in the edge map and edge pixels below the lower threshold are discarded. READ 'obtain_clust_edge_pic'
    output:
        void
'''
def img_preprocess(path_imgs, path_save, clusters, sigma):
    child_dir = os.listdir(path_imgs)  # Get the folders in the parent dir
    for child in child_dir:
        print("---------- Category: %s ----------" % (str(child)))
        if os.path.isdir(join(path_imgs, child)):  # we only deal with directories
            files = [f for f in os.listdir(join(path_imgs, child))]  # the files inside the dir
            # make sure the matching output subfolder exists before saving into it
            os.makedirs(join(path_save, child), exist_ok=True)
            # the 2 following variables are for displaying the progress
            len_files = len(files)
            count = 0
            for file in files:  # iterate over the files
                count += 1
                if count % 100 == 0:  # we display once every 100 images
                    print("Progress: %i of %i. %i%%" % (
                        count, len_files, (100 * count) // len_files))
                img_orig = cv2.imread(join(path_imgs, child, file))  # read the image
                img_orig = cv2.cvtColor(np.uint8(img_orig), cv2.COLOR_BGR2RGB)  # convert from BGR to RGB
                img_denoise = cv2.fastNlMeansDenoisingColored(img_orig, None, 10, 10, 7, 21)  # denoise the img
                # obtain cluster and edge images of the original and denoised images
                orig_clust, orig_edge = obtain_clust_edge_pic(img_orig, clusters, sigma)
                denoise_clust, denoise_edge = obtain_clust_edge_pic(img_denoise, clusters, sigma)
                # display the results
                plot_images(child, img_orig, orig_clust, orig_edge, img_denoise, denoise_clust, denoise_edge)
                # create and save an image with the edges marked
                img_with_edges = mark_edges(denoise_clust, denoise_edge)
                img_with_edges_resize = cv2.resize(img_with_edges, IMG_SIZE)
                plt.imsave(join(path_save, child, file), img_with_edges_resize)
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--real_img_path", type=str, required=True, help="path to the folder where the images are")
    parser.add_argument("--save_img_path", type=str, required=True, help="path to the dir where you want to save the output images")
    parser.add_argument("--num_clusters", type=int, required=True, help="number of clusters used to segment each image")
    parser.add_argument("--sigma", type=float, required=True, help="used to define a lower and upper threshold. Edge"
                        " pixels above the upper limit are kept in the edge map and edge"
                        " pixels below the lower threshold are discarded")
    args = parser.parse_args()
    # Call the function
    IMG_SIZE = (434, 343)
    img_preprocess(args.real_img_path, args.save_img_path, clusters=args.num_clusters, sigma=args.sigma)
'''
/*************************************************************
Script that calls multiple MATLAB and PYTHON functions in a sequential way.
It follows the whole process for TESTING OUR ALGORITHM:
1) creation of images
2) classification of them
3) plot based on instrument's parameter and classification (to see coherence). Only for 'Sphere images'
4) making suggestions for improving them
5) creation of suggested images
6) classification of suggested images we have just created
------------------------------------------------------
import datetime
# Define paths
scatter_model = "Sphere" # Particles model ('Sphere', 'Core-Shell Sphere')
sample_params_list = "[radius]" # Sample's parameter we want to study
img_path = "/home/dpt/fernandez-ruiz/sim/sim_data/Sphere/log_image/20190531_resize/"
results_file = "results.csv" # name of results file. It is going to be created inside 'img_path'
suggest_file = "suggest_improved.csv" # name of suggest file. It is going to be created inside 'img_path'
tmp_folder = "tmp" # tmp file. It is going to be created inside 'img_path'
save_img_sug_folder = "Classif_by_categ" # folder where we are going to create the suggested images. It is going to be created inside 'img_path'
python_script_path = "/users/fernandez-ruiz/scatteringimage/python/" # where the python scripts are. Remember that all of them should be in the same folder
matlab_path = "/users/fernandez-ruiz/scatteringimage/matlab/" # where the matlab scripts are. Remember that all of them should be in the same folder
retrain_model_path = "/home/dpt/fernandez-ruiz/TF_Results/sphere_7_improved/" # "output_labels.txt", "output_graph.pb" should be in this folder (files from retrain.py)
categ_search = "1" # [0: bad guinier, 1: good guinier, 2: one ring, 3: two or three rings, 4: four or five rings,
# 5: more than five rings, 6: bad background images, 7: background image]
num_clusters = "8" # number of clusters we are going to study
radius_min = "19" # we only consider images above this radius
radius_max = "350" # we only consider images below this radius
# Start timer
start = datetime.datetime.now()
eng = matlab.engine.start_matlab()
eng.addpath(matlab_path, '-end')
print("Matlab instance open")
# 1) GENERATE IMAGES in the img_path folder
eng.save_sim_sphere(img_path, nargout=0)
print("Images successfully generated")
# 2) CLASSIFY THEM into subfolders (one for each cluster) inside img_path. Write a results.csv file with the following info:
# * instrument's parameters: distance, collimation, wavelength, background
# * CNN classification for this image, and the mean of classification. e.g. 80% type 1 (Guinier) and 20% type 2 (One ring)
#   gives a "CNN classification" = 1 (Guinier) and a "mean of classification" = 0.8*1 + 0.2*2 = 1.2
# * sample's parameters: radius for *Sphere* model, radius, shell, rho-shell... for *Core-Shell Sphere* one.
subprocess.run(["python", join(python_script_path, "label_classify_image_folder.py"),
                "--dir", img_path,
                "--graph", join(retrain_model_path, "output_graph.pb"),
                "--labels", join(retrain_model_path, "output_labels.txt"),
                "--input_layer", "Placeholder",
                "--output_layer", "final_result",
                "--results_path", join(img_path, results_file),
                "--sample_params_list", *sample_params_list])
print("Images well classified. Results.csv created")
# 3) PLOT THE RESULTS to see if the classification looks right. Given a results.csv file, it plots all the images in it
# based on radius, distance, wavelength and the CNN classification of each image. Only for the 'Sphere' particle model (it only applies to radius)
if scatter_model == "Sphere":
    subprocess.run(["python", join(python_script_path, "plot_results.py"),
                    "--file_name", join(img_path, results_file), "--jitter", "0.01", "--multiple_graphs", "True"])
# 4) SUGGEST dist, col and wav for transforming previous images into the category searched. It creates a suggest.csv file
# with the following information:
# * instrument's parameters: distance, collimation, wavelength, background
# * CNN classification for this image. e.g. 80% type 1 (Guinier) and 20% type 2 (One ring) gives a
#   "CNN classification" = 1 (Guinier)
# * sample's parameters: radius for *Sphere* model, radius, shell, rho-shell... for *Core-Shell Sphere* one.
# * error for sample's parameters: distance between real and predicted sample's parameter value.
subprocess.run(["python", join(python_script_path, "multiv_multip_regression.py"),
                "--results_path", join(img_path, results_file),
                "--suggest_path", join(img_path, suggest_file),
                "--matlab_path", matlab_path,
                "--retrain_model_path", retrain_model_path,
                "--python_script_path", python_script_path,
                "--tmp_folder", join(img_path, tmp_folder),
                "--scatter_model", scatter_model,
                "--plot_result", "False",
                "--reg_type", "rf",  # random forest
                "--categ_search", categ_search,
                "--one_hot_encoder", "True",  # whether to use 'dist' and 'wav' as categorical variables
                "--save_reg_model", "False",  # whether or not to save the regression model
                "--duration", "long",  # longer but more accurate execution, or shorter but less accurate
                "--radius_min", radius_min,
                "--radius_max", radius_max,
                "--num_cluster", num_clusters])
print("Suggest.csv created")
# 5) CREATE SUGGESTED IMAGES to check if they are correct
# Create a temporary folder in which we are going to generate the suggested images. First, if it exists, we delete it
if os.path.isdir(join(img_path, save_img_sug_folder)):
    shutil.rmtree(join(img_path, save_img_sug_folder), ignore_errors=True)
os.mkdir(join(img_path, save_img_sug_folder))
for i in range(int(num_clusters)):  # we create a subfolder for each cluster
    os.mkdir(join(img_path, save_img_sug_folder, str(i)))
# this matlab code generates the images described in the suggest.csv file into the subfolders we have just created
eng.create_suggested_images(join(img_path, save_img_sug_folder), join(img_path, suggest_file), scatter_model, nargout=0)
print("Suggested images created")