Week of May 2nd
What I did
- Monday: NN data-loading
- Tuesday: NN pre-processing
- Wednesday: NN pause. I need to test that my gaussian averaging method is valid enough to be used for training data
- Thursday: Generated real disordered particles and ran simulations
- Friday: Made particle averaged xanes of real disordered structure and compared it to gaussian averaged method. Still thinking about the results. I also create a github organization for my Brookhaven National Lab group and applied to two jobs today.
- Saturday and Sunday: Applied to jobs and did some SQL practice
What I learned
- Monday: I spent awhile thinking about data-loaders, but my dataset is all text. So there’s no real reason I shouldn’t just load everything into memory at once.
- Tuesday: In tensorflow, you can add a first layer of your NN that will cale0transform the input data in the same way each time. Solves data-leakage problems. It’s a cool solution
- Wednesday: I Frankensteined some old code I wrote to shift particles in random directions via a distribution with the code I recently wrote to generate many structures. The way it works is incredibly convoluted, but at the same time, quite elegant.
def gen_delta_rho_shift(df, shift_sigma):
df_temp = df.copy()
df_temp['shift_val'] = np.random.normal(loc=0, scale=shift_sigma, size=df_temp.shape[0])
df_temp['theta'] = 6.28*np.random.random_sample(df_temp.shape[0])
df_temp['phi'] = 6.28*np.random.random_sample(df_temp.shape[0])
# shift_val=radius of sphere project new point onto = distance of new disordered atom from original location
df_temp['x'] += round(df_temp.shift_val*np.sin(df_temp.phi)*np.cos(df_temp.theta), 5)
df_temp['y'] += round(df_temp.shift_val*np.sin(df_temp.phi)*np.sin(df_temp.theta), 5)
df_temp['z'] += round(df_temp.shift_val*np.cos(df_temp.phi), 5)
# turn to numpy array
x1 = df_temp.loc[:, 'x'].values
y1 = df_temp.loc[:, 'y'].values
z1 = df_temp.loc[:, 'z'].values
return x1, y1, z1
I’ve been thinking about this code, and I think there’s a relationship between the middle section of that code and the dot product with the unit vector. I could probably greatly improve the code’s efficiency if I figure this out.
- Thursday: Simulations complete. Ran them on the BNL cluster to save time.
- Friday: Learned about a software dev job and talked about what “agile” and “scrum” environments are actually like. Unclear if gaussian averaging method will actually work.
What I will do next
- Create new realistic disordered structures and compare to the gauss averaged structures